Making iptables changes atomically and not dropping packets.

I'm working on rolling out iptables rules to all of our servers at work. It's not a totally simple task, as many things can go wrong.

The first problem is the one where you can shoot yourself in the foot. Install a new set of rules for testing on a remote server, and suddenly your ssh session stops responding. I covered how to work around that in a previous post.

Another problem is making your firewall changes atomically, with all rules pushed in a single step. On Linux, if you have a script with many lines of 'iptables' invocations, running it makes one rule change per iptables command. And what if you write your rules like this?

# Flush rules so we can install our new ones.
iptables -F

# First rule, drop input by default
iptables -P INPUT DROP

# Other rules here...
iptables -A INPUT ... -j ACCEPT
iptables -A INPUT ... -j ACCEPT
If your server is highly trafficked, then the delay between setting the 'DROP' default and adding the accept rules can mean dropped traffic. That sucks. This is an example of a race condition. Additionally, there's a second race condition earlier in the script: 'iptables -F' flushes the rules but leaves the chain policy alone, so depending on the current default policy for INPUT, we may drop or accept all traffic for a very short period. Bad.

One other problem I thought could occur was a state tracking problem with conntrack. If previously we weren't using conntrack, what would happen to existing connections when I set default deny and only allowed connections that were established? Something like this:

iptables -P INPUT DROP
iptables -A INPUT -i eth0 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp -m tcp --dport 22 --syn -j ACCEPT
I did some testing with this, and I may be wrong here, but it does not drop my existing sessions as I had predicted. This is a good thing. It turns out that when this runs, the 'conntrack' table is populated with existing connections from the network stack. This further helps us not drop traffic when pushing new firewall rules. You can view the current conntrack table in the file /proc/net/ip_conntrack (on newer kernels, /proc/net/nf_conntrack).
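
If you want to peek at that table from a script, something like the following should do. It's a minimal sketch: 'show_conntrack' is a name I made up, and it assumes the table lives in one of the two proc files (the ip_ name on older kernels, the nf_ name on newer ones) and that you can read it, which usually means root.

```shell
# show_conntrack [file...]: print the first few tracked connections.
# Hypothetical helper, not part of iptables or conntrack-tools.
show_conntrack() {
  # With no arguments, fall back to the two proc paths (an assumption
  # about where this kernel exposes the table).
  [ $# -gt 0 ] || set -- /proc/net/nf_conntrack /proc/net/ip_conntrack
  for f in "$@"; do
    if [ -r "$f" ]; then
      # Use the first readable candidate and stop looking.
      head -n 5 "$f"
      return 0
    fi
  done
  echo "no readable conntrack table found" >&2
  return 1
}
```

Run 'show_conntrack' with no arguments right after loading the new rules; if your existing ssh session shows up there, the ESTABLISHED rule should match it.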

What options do we have for atomically applying a bunch of rules so we don't drop traffic? The iptables tool set comes with 'iptables-save' which lets you save your existing iptables rules to a file. I was unable to find any documentation on the exact format of this file, but it seems easy enough to read. The output includes rules and counters for each table and chain. Counters are optional.
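
For a feel of the format, here is a hand-written sample (not captured from a real host; the rules and the zeroed counters are made up for illustration). The snippet just writes it to a temp file and prints it, so nothing here touches your live firewall.

```shell
# Hypothetical sample in iptables-save format, written by hand.
sample="$(mktemp)"
cat > "$sample" <<'EOF'
# Table declaration: everything up to COMMIT applies to the filter table.
*filter
# Chain declarations: name, default policy, [packets:bytes] counters.
:INPUT DROP [0:0]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [0:0]
# Rules take the same arguments as the iptables command line.
-A INPUT -i lo -j ACCEPT
-A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
-A INPUT -p tcp -m tcp --dport 22 --syn -j ACCEPT
COMMIT
EOF
cat "$sample"
```

Note that the default policy lives in the chain declaration (':INPUT DROP'), so there is no separate 'iptables -P' step in the file.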

All the documentation I've read indicates that using 'iptables-restore' will apply all of the rules atomically. This lets us set a pile of rules all at once without any race conditions.

So I generate an iptables-restore file and use iptables-restore to install it. No traffic dropped. I'm generating it with a shell script, so there was one gotcha - I basically take iptables commands and output them to a file. I do this with a shell function I wrote, called 'addrule'. However, I have some rules like this:

addrule -A INPUT -p tcp -m limit --limit 5/min -j LOG --log-prefix "Denied TCP: " --log-level debug
I quoted the argument in the addrule invocation, but we also need to produce a quoted version in our iptables-restore rule file; otherwise --log-prefix will get set to 'Denied' and we'll also fail because 'TCP:' is not an option iptables expects. It appears to be safe to quote all arguments in the iptables-restore files except for lines declaring chains and their counters (like ':INPUT ACCEPT [12345:987235]'), defining tables (like '*filter'), or the 'COMMIT' command. Instead of quoting everything, I just quote any argument that contains a space.

The fix makes my 'addrule' function look like this:

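rulefile="$(mktemp)"

# Append one rule line to $rulefile, re-quoting any argument that has a space.
addrule() {
  while [ $# -gt 0 ] ; do
    case "$1" in
      # The shell stripped the quotes; put them back so that
      # iptables-restore sees a single argument.
      *" "*) printf '"%s"' "$1" ;;
      *)     printf '%s' "$1" ;;
    esac
    # Separate arguments with one space, but not after the last one.
    [ $# -gt 1 ] && printf ' '
    shift
  done >> "$rulefile"
  echo >> "$rulefile"
}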

# So this:
#   addrule -A INPUT -j LOG --log-prefix "Hello World"
# will output this to the $rulefile
#   -A INPUT -j LOG --log-prefix "Hello World"
So now the quoted arguments stay quoted. All of that madness is in the name of being able to simply replace 'iptables' with 'addrule' and you're good to go. No extra formatting changes necessary.
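
To show how the pieces might fit together, here is a sketch of a whole generator script. The 'raw' helper is hypothetical (my name for it), as are the specific rules; it exists because the table, chain, and COMMIT lines must be written without quoting. The 'addrule' here is a printf-based take on the same idea, included so the sketch runs on its own.

```shell
#!/bin/sh
rulefile="$(mktemp)"

# Write a line verbatim: for '*filter', chain declarations, and 'COMMIT'.
# Hypothetical helper, named 'raw' only for this sketch.
raw() {
  printf '%s\n' "$1" >> "$rulefile"
}

# Append one rule, re-quoting any argument that contains a space.
addrule() {
  while [ $# -gt 0 ] ; do
    case "$1" in
      *" "*) printf '"%s"' "$1" ;;
      *)     printf '%s' "$1" ;;
    esac
    [ $# -gt 1 ] && printf ' '
    shift
  done >> "$rulefile"
  echo >> "$rulefile"
}

raw '*filter'
# The default policy is set in the chain declaration, not by a rule.
raw ':INPUT DROP [0:0]'
raw ':FORWARD ACCEPT [0:0]'
raw ':OUTPUT ACCEPT [0:0]'
addrule -A INPUT -i lo -j ACCEPT
addrule -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
addrule -A INPUT -p tcp -m tcp --dport 22 --syn -j ACCEPT
addrule -A INPUT -p tcp -m limit --limit 5/min -j LOG --log-prefix "Denied TCP: " --log-level debug
raw 'COMMIT'

# Print the generated file; it is only built here, never applied.
cat "$rulefile"
```

Because the policy and every rule land in one file, nothing reaches the kernel until the file is handed to iptables-restore.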

One last thing I did was to make sure iptables-restore didn't reject my file, and if it did, to tell me:

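# Dry-run first (-t parses the file without committing anything).
if iptables-restore -t "$rulefile" ; then
  echo "iptables restore test successful, applying rules..."
  iptables-restore -v "$rulefile"
  rm "$rulefile"
else
  # Dump the rejected file to stderr so the failure can be debugged.
  echo "iptables test failed. Rule file:" >&2
  echo "---" >&2
  cat "$rulefile" >&2
  rm "$rulefile"
  exit 1
fi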
Throw this script into puppet and we've got automated firewall rule management that won't accidentally drop traffic on rule changes.

2 responses to 'Making iptables changes atomically and not dropping packets.'


Adam wrote at Wed Mar 10 08:12:30 2010...
I typically leave my default chain policy set to ACCEPT, then make the last rule a DROP. This way when I flush the ruleset I don't lock myself out.

iptables -P INPUT DROP
iptables -A INPUT -s trusted.com -j ACCEPT
time passes
iptables -F INPUT
WHOOPS.

vs.

iptables -P INPUT ACCEPT
iptables -A INPUT -s trusted.com -j ACCEPT
iptables -A INPUT -j DROP
time passes
iptables -F INPUT
WHEW.

In my environment I'm usually editing /etc/sysconfig/iptables instead of issuing the commands directly. You can then use either the service script or iptables-restore(8) to load the file.

I admire your ingenuity with the ping-payload method to flush your ruleset, but I think I'll be sticking to at(1). It feels more fool-proof :) Plus more often than not our network administrators drop ICMP on the switches, so it wouldn't work in my environment anyway.

Jordan Sissel wrote at Thu Mar 11 15:36:22 2010...
Agreed there.

Though, I don't push anything in production that isn't done with automation, so the ngrep/ping hack is a better option than an at(1) job. The ngrep/ping hack is getting pushed as a daemontools service to all my servers while I iterate on the iptables rules.

