Message-ID: <4D220CFB.5060300@d9t.de>
Date: Mon, 03 Jan 2011 18:52:59 +0100
From: "Sebastian J. Bronner" <sebastian.bronner@....de>
To: netdev@...r.kernel.org
CC: Daniel Kraft <daniel.kraft@....de>
Subject: bridge not routing packets via source bridgeport
Hi all,
We recently upgraded from 2.6.32.25 to 2.6.35.24 and discovered that our
virtual machines can no longer access their own external IP addresses.
Testing revealed that 2.6.34 was the last version not to have the
problem. 2.6.36 still had it. But on to the details.
Our setup:
We use KVM to virtualise our guests. The physical machines (nodes) act
as One-to-One NAT routers to the virtual machines. The virtual machines
are connected via virtio interfaces in a bridge.
Since the virtual machines only know about their RFC-1918 addresses, any
request they make to their NATed global addresses requires a trip
through the node's netfilter to perform the needed SNAT and DNAT operations.
Take the following setup:
         {internet}
             |
          (eth0)       <- 1.1.1.254, proxy_arp=1
             |
          [node]       <- ip_forward=1, routes*, nat**
             |
         (virbr1)      <- 10.0.0.1
          /    \
    (vnet0)     |
       |     (vnet1)
    (veth0)     |      <- 10.0.0.2
       |     (veth0)   <- 10.0.0.3
     [vm1]      |
             [vm2]
* The static routes on the node for the vms mentioned above are as follows:
# ip r
1.1.1.2 dev virbr1 scope link
1.1.1.3 dev virbr1 scope link
** The NAT rules are set up as follows (in reality, they're a bit more
complicated - but this suffices to illustrate the problem at hand):
# iptables-save -t nat
-A PREROUTING -d 1.1.1.2 -j DNAT --to-destination 10.0.0.2
-A PREROUTING -d 1.1.1.3 -j DNAT --to-destination 10.0.0.3
-A POSTROUTING -s 10.0.0.2 -j SNAT --to-source 1.1.1.2
-A POSTROUTING -s 10.0.0.3 -j SNAT --to-source 1.1.1.3
This means that 1.1.1.2 maps to 10.0.0.2 (vm1) and
1.1.1.3 maps to 10.0.0.3 (vm2).
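
To make the expected translation concrete, here is a minimal sketch in
Python. It is a toy model of the four rules above, not netfilter itself,
and the names DNAT, SNAT and translate are made up for this illustration.
It shows what the address pair should look like once a packet has passed
PREROUTING and POSTROUTING on the node.

```python
# Toy model of the one-to-one NAT rules quoted above; not netfilter itself.
DNAT = {"1.1.1.2": "10.0.0.2", "1.1.1.3": "10.0.0.3"}  # PREROUTING rules
SNAT = {"10.0.0.2": "1.1.1.2", "10.0.0.3": "1.1.1.3"}  # POSTROUTING rules


def translate(src, dst):
    """Return (src, dst) as the packet should look when it leaves the node."""
    dst = DNAT.get(dst, dst)  # PREROUTING: rewrite the destination first
    src = SNAT.get(src, src)  # POSTROUTING: then rewrite the source
    return src, dst


print(translate("10.0.0.2", "1.1.1.3"))  # vm1 -> vm2's public address
print(translate("10.0.0.2", "1.1.1.2"))  # vm1 -> its own public address
```

In this model, a packet from vm1 to its own public address should come
back to vm1 with source 1.1.1.2 and destination 10.0.0.2.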
Assuming ssh is running on both vms, running 'nc -v 1.1.1.3 22' from vm1
gets me ssh's introductory message.
Assuming no service is running on port 23, running 'nc -v 1.1.1.3 23'
from vm1 gets me 'Connection refused'.
That's all fine and exactly as it should be. The vms are accessible from
the internet as well, and can access the internet.
If, however, I run 'nc -v 1.1.1.2 22' from vm1 (or any port for that
matter), I get a timeout!
Running tcpdump on all the involved interfaces showed me that the
packets successfully traverse veth0 and vnet0 and appear to get lost
upon reaching virbr1.
So, then I decided to set up a packet trace with iptables:
[on the node]
# modprobe ipt_LOG
# iptables -t raw -A PREROUTING -p tcp --dport 4577 -j TRACE
# tail -f /var/log/messages | grep TRACE
[on vm1]
# nc -v 1.1.1.2 4577
The results were very interesting, if somewhat dumbfounding. They are
attached for easier perusal. The gist of it is that the packet in
question disappears without a trace after going through the DNAT rule in
the PREROUTING chain of the NAT table. This can be seen happening three
times in vm1-to-1.1.1.2.txt, at three- and six-second intervals (retries).
For comparison, I have also included a trace of a successful packet
traversal that ends in a 'Connection refused'. It is in vm1-to-1.1.1.3.txt.
As a last note, I should add that the problem isn't related to the IP
address. I eliminated that by putting two RFC-1918 IPs on vm1 and
mapping two IPs to it, then running nc on one IP, while the other one
was being used as the source IP.
The problem appears to be that packets can't be routed back out of the
same bridge port that they arrived on.
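
As a minimal sketch of that hypothesis, the following Python snippet
classifies a flow by the bridge ports it would use, based only on the
topology in the diagram above. PORT_OF, bridge_ports and is_hairpin are
hypothetical names for this illustration. The snippet says nothing about
what the kernel actually does; it only shows which of the tested cases
would enter and leave virbr1 through the same port.

```python
# Hypothetical helper describing the topology in the diagram; not kernel code.
DNAT = {"1.1.1.2": "10.0.0.2", "1.1.1.3": "10.0.0.3"}  # public -> private
PORT_OF = {"10.0.0.2": "vnet0", "10.0.0.3": "vnet1"}   # bridge port per vm


def bridge_ports(src, dst):
    """Return (ingress_port, egress_port) on virbr1 for a packet src -> dst."""
    return PORT_OF[src], PORT_OF[DNAT.get(dst, dst)]


def is_hairpin(src, dst):
    """True if the packet would leave through the port it arrived on."""
    ingress, egress = bridge_ports(src, dst)
    return ingress == egress


print(is_hairpin("10.0.0.2", "1.1.1.3"))  # vm1 -> vm2: different ports
print(is_hairpin("10.0.0.2", "1.1.1.2"))  # vm1 -> itself: same port
```

Only the second case, the one that times out, uses vnet0 for both
directions.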
I hope this all makes sense and that you can reproduce the problem. One
virtual machine will suffice to see the problem at work.
Feel free to contact me if you need more information or have suggestions
for me.
Cheers,
Sebastian Bronner
P.S.: The IP addresses are faked. I used vim to consistently replace all
instances of the real IPs with the fake ones used in this e-mail.
--
*Sebastian J. Bronner*
Administrator
D9T GmbH - Magirusstr. 39/1 - D-89077 Ulm
Tel: +49 731 1411 696-0 - Fax: +49 731 3799-220
Geschäftsführer: Daniel Kraft
Sitz und Register: Ulm, HRB 722416
Ust.IdNr: DE 260484638
http://d9t.de - D9T High Performance Hosting
info@....de
View attachment "vm1-to-1.1.1.2.txt" of type "text/plain" (3243 bytes)
View attachment "vm1-to-1.1.1.3.txt" of type "text/plain" (2284 bytes)
View attachment "iptables-nat.txt" of type "text/plain" (646 bytes)