Arnt Gulbrandsen
2011-11-22

A Mikrotik IPsec policy bug

The short version: Mikrotik RouterOS doesn't support multiple, redundantly configured IPsec links. Amazon's cloud services use just that. Pain ensues. I haven't found any workaround I really like.

The long version: Amazon tells the Mikrotik "Packets to 10.12.0.0/16 should be sent to 169.254.255.1, and packets from that network will be sent to you from 169.254.255.1" and also "Packets to 10.12.0.0/16 should be sent to 169.254.255.5", etc. Two tunnels, same destination network, different endpoints.

The Mikrotik detects a conflict between the two rules and disables one of them. The output of ip ipsec policy print detail marks one rule with an I; that is the disabled rule. Any traffic matching that rule will be lost. Each tunnel also carries traffic to the router at the other end, so DPD and monitoring will probably think the tunnel is up and all is well, but traffic matching the I rule will still not be delivered.
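To make the effect concrete, here is a minimal Python sketch of the behaviour described above. It is a toy model, not RouterOS code: the Policy class, the mark_conflicts and peer_for helpers, and the local network 192.168.0.0/24 are all hypothetical, and I assume a "conflict" simply means that a later rule has overlapping source and destination networks with an earlier one.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass
class Policy:
    """Hypothetical, simplified model of one IPsec policy."""
    src: str               # local network the rule matches
    dst: str               # remote network the rule matches
    peer: str              # endpoint that matching traffic is sent to
    invalid: bool = False  # stands in for the I flag in the printout


def selectors_overlap(a, b):
    """Assumed conflict test: both source and destination networks overlap."""
    return (ip_network(a.src).overlaps(ip_network(b.src))
            and ip_network(a.dst).overlaps(ip_network(b.dst)))


def mark_conflicts(policies):
    """Flag each policy that conflicts with an earlier, still valid policy."""
    accepted = []
    for p in policies:
        if any(selectors_overlap(p, q) for q in accepted):
            p.invalid = True
        else:
            accepted.append(p)
    return policies


def peer_for(policies, src_ip, dst_ip):
    """Return the peer of the first valid policy matching a packet, or None."""
    for p in policies:
        if p.invalid:
            continue
        if (ip_address(src_ip) in ip_network(p.src)
                and ip_address(dst_ip) in ip_network(p.dst)):
            return p.peer
    return None


# The two rules Amazon asks for; 192.168.0.0/24 is an assumed local network.
policies = mark_conflicts([
    Policy("192.168.0.0/24", "10.12.0.0/16", "169.254.255.1"),
    Policy("192.168.0.0/24", "10.12.0.0/16", "169.254.255.5"),
])

for p in policies:
    print("I" if p.invalid else " ", p.dst, "via", p.peer)
```

In this model the second rule ends up flagged, and peer_for only ever returns the first endpoint, so the second tunnel never carries traffic for 10.12.0.0/16 even though the tunnel itself may look healthy.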

Since I like to see green blinkenlights in my monitoring, and the AWS console really wants to use both tunnels, I tried to find a workaround that pleases Amazon. I tried using route filters and BGP path stuffing to give Amazon the routes it wants, while avoiding actually using the routes that depend on the frowned-upon policy. No luck. The only way seems to be to disable one BGP peer and/or one policy by hand, and let the AWS console show yellow instead of green. Later I may try to set up a second tunnel to another Mikrotik router for redundancy. But not right now. I want to write code.

If you want to ask Mikrotik about it, send mail to support@ and ask to be notified when the problem is resolved. Mention ticket 2011091666000524 so they'll know which problem it is.

Update: Mate Lang has written a script to translate AWS' generic instructions to Mikrotik commands.

The point of what AWS is doing is to express "packets may be sent using either tunnel A or tunnel B", such that when one tunnel is down due to key renegotiation (which takes about one second and runs once per hour) or because AWS' router is down, the other tunnel is used. Mate's fine work still does not allow Mikrotik routers to fail over and use both tunnels in the way AWS intends.
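The failover AWS has in mind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Tunnel class, pick_tunnel and seconds_without_tunnel are hypothetical names, rekeying is modelled as exactly one second of downtime every 3600 seconds, and the two tunnels are assumed to rekey at different moments.

```python
from dataclasses import dataclass

REKEY_INTERVAL = 3600  # seconds; rekeying runs once per hour
REKEY_DURATION = 1     # seconds; rekeying takes about one second (assumed exact)


@dataclass
class Tunnel:
    """Hypothetical model of one tunnel that is down only while rekeying."""
    name: str
    rekey_offset: int  # assumed second within the hour when rekeying starts

    def is_up(self, t):
        """True unless second t falls inside this tunnel's rekey window."""
        phase = (t - self.rekey_offset) % REKEY_INTERVAL
        return phase >= REKEY_DURATION


def pick_tunnel(tunnels, t):
    """Return the first tunnel that is up at second t, or None."""
    for tunnel in tunnels:
        if tunnel.is_up(t):
            return tunnel
    return None


def seconds_without_tunnel(tunnels, span=REKEY_INTERVAL):
    """Count the seconds in the span during which no tunnel is usable."""
    return sum(1 for t in range(span) if pick_tunnel(tunnels, t) is None)


a = Tunnel("A", rekey_offset=0)
b = Tunnel("B", rekey_offset=1800)

print("one tunnel, seconds lost per hour:", seconds_without_tunnel([a]))
print("two tunnels, seconds lost per hour:", seconds_without_tunnel([a, b]))
```

With a single usable tunnel the model loses the rekey window every hour; with two tunnels that rekey at different times there is always one to pick. That second case is what the Mikrotik cannot do while one of the two policies is disabled.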

2011-06-10

Clueless in the cloud

What Amazon wrote:

We have noticed that one or more of your instances are running on a host degraded due to hardware failure. [...]

The host needs to undergo maintenance and will be taken down [...]

What Amazon might have written:

Thought you were clever, eh? Running that fancy Cassandra cluster? I bet you didn't expect your redundant copies on several Cassandra nodes to really be stored on the same crummy drive.