Date:	Mon, 03 Jan 2011 18:52:59 +0100
From:	"Sebastian J. Bronner" <sebastian.bronner@....de>
To:	netdev@...r.kernel.org
CC:	Daniel Kraft <daniel.kraft@....de>
Subject: bridge not routing packets via source bridgeport

Hi all,

we recently upgraded from 2.6.32.25 to 2.6.35.24 and discovered that our
virtual machines can no longer access their own external IP addresses.
Testing revealed that 2.6.34 is the last version without the problem;
2.6.36 still has it. But on to the details.

Our setup:

We use KVM to virtualise our guests. The physical machines (nodes) act
as One-to-One NAT routers to the virtual machines. The virtual machines
are connected via virtio interfaces in a bridge.

Since the virtual machines only know about their RFC-1918 addresses, any
request they make to their NATed global addresses requires a trip
through the node's netfilter to perform the needed SNAT and DNAT operations.

Take the following setup:

   {internet}
       |
     (eth0)       <- 1.1.1.254, proxy_arp=1
       |
     [node]       <- ip_forward=1, routes*, nat**
       |
    (virbr1)      <- 10.0.0.1
    /      \
(vnet0)     |
   |     (vnet1)
(veth0)     |     <- 10.0.0.2
   |     (veth0)  <- 10.0.0.3
 [vm1]      |
          [vm2]

* The static routes on the node for the vms mentioned above are as follows:
# ip r
1.1.1.2 dev virbr1 scope link
1.1.1.3 dev virbr1 scope link

** The NAT rules are set up as follows (in reality, they're a bit more
complicated - but this suffices to illustrate the problem at hand):
# iptables-save -t nat
-A PREROUTING -d 1.1.1.2 -j DNAT --to-destination 10.0.0.2
-A PREROUTING -d 1.1.1.3 -j DNAT --to-destination 10.0.0.3
-A POSTROUTING -s 10.0.0.2 -j SNAT --to-source 1.1.1.2
-A POSTROUTING -s 10.0.0.3 -j SNAT --to-source 1.1.1.3

This means that 1.1.1.2 maps to 10.0.0.2 (vm1) and
                1.1.1.3 maps to 10.0.0.3 (vm2).
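
To make the mapping concrete, here is a minimal sketch of what the two
rule pairs do to a packet's addresses. It is a toy model in Python, not
real netfilter: the function name translate() and the two dictionaries
are made up for illustration, and connection tracking (which applies
the rules only to the first packet of a connection and reverses them
for replies) is ignored.

```python
from typing import Tuple

# Toy model of the one-to-one NAT rules listed above (not real netfilter).
DNAT = {"1.1.1.2": "10.0.0.2", "1.1.1.3": "10.0.0.3"}  # PREROUTING rules
SNAT = {"10.0.0.2": "1.1.1.2", "10.0.0.3": "1.1.1.3"}  # POSTROUTING rules


def translate(src: str, dst: str) -> Tuple[str, str]:
    """Return (src, dst) as the packet would look after both NAT steps."""
    dst = DNAT.get(dst, dst)  # PREROUTING rewrites the destination first
    src = SNAT.get(src, src)  # POSTROUTING rewrites the source afterwards
    return src, dst


if __name__ == "__main__":
    # vm1 talking to vm2's global address
    print(translate("10.0.0.2", "1.1.1.3"))  # ('1.1.1.2', '10.0.0.3')
    # vm1 talking to its own global address
    print(translate("10.0.0.2", "1.1.1.2"))  # ('1.1.1.2', '10.0.0.2')
```

In the second case the translated packet is addressed back to vm1
itself, so it has to leave the node towards the same vm it came from.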

Assuming ssh is running on both vms, running 'nc -v 1.1.1.3 22' from vm1
gets me ssh's introductory message.

Assuming no service is running on port 23, running 'nc -v 1.1.1.3 23'
from vm1 gets me 'Connection refused'.

That's all fine and exactly as it should be. The vms are accessible from
the internet as well, and can access the internet.

If, however, I run 'nc -v 1.1.1.2 22' from vm1 (or any port for that
matter), I get a timeout!

Running tcpdump on all the involved interfaces showed me that the
packets successfully traverse veth0 and vnet0 and appear to get lost
upon reaching virbr1.

So, then I decided to set up a packet trace with iptables:
[on the node]
# modprobe ipt_LOG
# iptables -t raw -A PREROUTING -p tcp --dport 4577 -j TRACE
# tail -f /var/log/messages | grep TRACE
[on vm1]
# nc -v 1.1.1.2 4577

The results were very interesting, if somewhat dumbfounding. They are
attached for easier perusal. The gist of it is that the packet in
question disappears without a trace after going through the DNAT rule in
the PREROUTING chain of the NAT table. This can be seen happening three
times in vm1-to-1.1.1.2.txt, at three- and six-second intervals (retries).

For comparison, I have also included a trace of a successful packet
traversal that ends in a 'Connection refused'. It is in vm1-to-1.1.1.3.txt.
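
For anyone who wants to compare such traces mechanically, the following
Python sketch reports the last table and chain in which each traced
packet was seen. It relies on the "TRACE: table:chain:type:rulenum"
prefix that the TRACE target writes and on the ID= field of the log
line. The function name last_hook_per_packet() and the SAMPLE lines are
made up for illustration; the sample is NOT taken from the attached
files. Grouping by IP ID is only a heuristic.

```python
import re
from collections import OrderedDict

# Matches the TRACE prefix and the IP ID field of an iptables log line.
TRACE_RE = re.compile(
    r"TRACE: (?P<table>\w+):(?P<chain>[\w-]+):(?P<type>\w+):(?P<num>\d+) "
    r".*?\bID=(?P<id>\d+)\b"
)


def last_hook_per_packet(lines):
    """Map IP ID -> last 'table:chain' in which that packet was traced."""
    last = OrderedDict()
    for line in lines:
        m = TRACE_RE.search(line)
        if m:
            last[m.group("id")] = f"{m.group('table')}:{m.group('chain')}"
    return last


# Hypothetical, shortened log lines (invented for this example).
SAMPLE = [
    "TRACE: raw:PREROUTING:policy:2 IN=virbr1 OUT= SRC=10.0.0.2 DST=1.1.1.2 ID=4711 DPT=4577",
    "TRACE: nat:PREROUTING:rule:1 IN=virbr1 OUT= SRC=10.0.0.2 DST=1.1.1.2 ID=4711 DPT=4577",
    "TRACE: raw:PREROUTING:policy:2 IN=virbr1 OUT= SRC=10.0.0.2 DST=1.1.1.3 ID=4712 DPT=4577",
    "TRACE: nat:PREROUTING:rule:2 IN=virbr1 OUT= SRC=10.0.0.2 DST=1.1.1.3 ID=4712 DPT=4577",
    "TRACE: filter:FORWARD:policy:1 IN=virbr1 OUT=virbr1 SRC=10.0.0.2 DST=10.0.0.3 ID=4712 DPT=4577",
    "TRACE: nat:POSTROUTING:rule:2 IN= OUT=virbr1 SRC=10.0.0.2 DST=10.0.0.3 ID=4712 DPT=4577",
]

if __name__ == "__main__":
    for ip_id, hook in last_hook_per_packet(SAMPLE).items():
        print(ip_id, "last seen in", hook)
```

A packet whose last entry is nat:PREROUTING, while others go on to
FORWARD and POSTROUTING, matches the kind of disappearance described
above.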

As a last note, I should add that the problem isn't related to the IP
address. I eliminated that by putting two RFC-1918 IPs on vm1 and
mapping two IPs to it, then running nc on one IP, while the other one
was being used as the source IP.

The problem appears to be that packets can't be routed out the same
bridgeport that they arrived from.
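
Stated as a toy model, the hypothesis looks like the Python sketch
below. This is only an illustration of the assumption made above, not
the kernel's bridge code; the function name predicted_delivery() and
the PORT_OF table (taken from the diagram) are made up for this
example.

```python
# Toy model of the hypothesis above; this is NOT the kernel's bridge code.
PORT_OF = {"10.0.0.2": "vnet0", "10.0.0.3": "vnet1"}  # vm address -> bridge port
DNAT = {"1.1.1.2": "10.0.0.2", "1.1.1.3": "10.0.0.3"}  # PREROUTING rules


def predicted_delivery(src_ip: str, dst_ip: str) -> bool:
    """Predict delivery, assuming a packet is dropped whenever its
    egress bridge port equals the port it arrived on."""
    in_port = PORT_OF[src_ip]
    out_port = PORT_OF[DNAT.get(dst_ip, dst_ip)]
    return out_port != in_port


if __name__ == "__main__":
    print(predicted_delivery("10.0.0.2", "1.1.1.3"))  # True: vnet0 -> vnet1
    print(predicted_delivery("10.0.0.2", "1.1.1.2"))  # False: vnet0 -> vnet0
```

The model depends only on the ports, not on the addresses, which is
consistent with the two-IP test described above.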

I hope this all makes sense and that you can reproduce the problem. One
virtual machine will suffice to see the problem at work.

Feel free to contact me if you need more information or have suggestions
for me.

Cheers,
Sebastian Bronner

P.S.: The IP addresses are faked. I used vim to replace all instances of
the real IPs with the fake ones used in this e-mail consistently.
-- 
*Sebastian J. Bronner*
Administrator

D9T GmbH - Magirusstr. 39/1 - D-89077 Ulm
Tel: +49 731 1411 696-0 - Fax: +49 731 3799-220

Geschäftsführer: Daniel Kraft
Sitz und Register: Ulm, HRB 722416
Ust.IdNr: DE 260484638

http://d9t.de - D9T High Performance Hosting
info@....de

View attachment "vm1-to-1.1.1.2.txt" of type "text/plain" (3243 bytes)

View attachment "vm1-to-1.1.1.3.txt" of type "text/plain" (2284 bytes)

View attachment "iptables-nat.txt" of type "text/plain" (646 bytes)
