Date:   Wed, 28 Sep 2022 16:02:43 +0200
From:   Maximilien Cuony <maximilien.cuony@...anite.ch>
To:     "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>
Cc:     netdev@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: [REGRESSION] Unable to NAT own TCP packets from another VRF with
 tcp_l3mdev_accept = 1

Hello,

We're using VRFs on a machine that acts as a router, and we have a
specific issue where the router doesn't handle its own packets correctly
during NAT if the packet comes from a different VRF.

We had the issue with Debian buster (4.19), but it went away when we
updated to Debian bullseye (5.10.92).

However, after upgrading Debian bullseye to the latest kernel
(5.10.140), the issue appeared again.

We did a bisection and this led us to
"b0d67ef5b43aedbb558b9def2da5b4fffeb19966 net: allow unbound socket for
packets in VRF when tcp_l3mdev_accept set [ Upstream commit
944fd1aeacb627fa617f85f8e5a34f7ae8ea4d8e ]".

Simplified case setup:

There are two machines in the setup. They both forward packets
(net.ipv4.ip_forward = 1) and there are two links between them.

The main machine has two VRFs. The default VRF uses the second
machine as its default route, on a specific interface.
The second machine has its default route pointing to the main machine,
in the other VRF, using the second pair of interfaces.

On the main machine, the second interface is in a specific VRF. In that 
VRF, packets are NATed to the internet on a third interface.

A diagram showing the normal flow is available here:
https://etinacra.ch/kernel.png

Configuration commands:

Main machine:
sysctl -w net.ipv4.tcp_l3mdev_accept=1
sysctl -w net.ipv4.ip_forward=1
iptables -t raw -A PREROUTING -i eth0 -j CT --zone 5
iptables -t raw -A OUTPUT -o eth0 -j CT --zone 5
iptables -t nat -A POSTROUTING -o eth2 -j SNAT --to 192.168.1.1
cat /etc/network/interfaces

auto firewall
iface firewall
     vrf-table 1200

auto eth0
iface eth0
     address 192.168.5.1/24
     gateway 192.168.5.2

auto eth1
iface eth1
     address 192.168.10.1/24
     vrf firewall
     up ip route add 192.168.5.0/24 via 192.168.10.2 vrf firewall

auto eth2
iface eth2
     address 192.168.1.1/24
     gateway 192.168.1.250
     vrf firewall

==

Second machine:

sysctl -w net.ipv4.ip_forward=1

cat /etc/network/interfaces

auto eth0
iface eth0
     address 192.168.5.2/24

auto eth1
iface eth1
     address 192.168.10.2/24
     gateway 192.168.10.1

==

When the issue is not present, a tcpdump on all interfaces of the main
machine shows everything working as expected (output truncated):

10:28:32.811283 eth0 Out IP 192.168.5.1.55750 > 99.99.99.99.80: Flags 
[S], seq 2216112145
10:28:32.811666 eth1 In  IP 192.168.5.1.55750 > 99.99.99.99.80: Flags 
[S], seq 2216112145
10:28:32.811679 eth2 Out IP 192.168.1.1.55750 > 99.99.99.99.80: Flags 
[S], seq 2216112145
10:28:32.835138 eth2 In  IP 99.99.99.99.80 > 192.168.1.1.55750: Flags 
[S.], seq 383992840, ack 2216112146
10:28:32.835152 eth1 Out IP 99.99.99.99.80 > 192.168.5.1.55750: Flags 
[S.], seq 383992840, ack 2216112146
10:28:32.835457 eth0 In  IP 99.99.99.99.80 > 192.168.5.1.55750: Flags 
[S.], seq 383992840, ack 2216112146
10:28:32.835511 eth0 Out IP 192.168.5.1.55750 > 99.99.99.99.80: Flags 
[.], ack 1, win 502

However, when the issue is present, the SYN-ACK does arrive on eth2, but
is never "unNATed" back to eth1:

10:25:07.644433 eth0 Out IP 192.168.5.1.48684 > 99.99.99.99.80: Flags 
[S], seq 3207393154
10:25:07.644782 eth1 In  IP 192.168.5.1.48684 > 99.99.99.99.80: Flags 
[S], seq 3207393154
10:25:07.644793 eth2 Out IP 192.168.1.1.48684 > 99.99.99.99.80: Flags 
[S], seq 3207393154
10:25:07.668551 eth2 In  IP 54.36.61.42.80 > 192.168.1.1.48684: Flags 
[S.], seq 823335485, ack 3207393155

The issue only affects TCP connections. UDP and ICMP work fine.

Setting net.ipv4.tcp_l3mdev_accept back to 0 also fixes the issue, but
we need this flag since we use some sockets that do not understand VRFs.

We did have a look at the diff and at the code of inet_bound_dev_eq, but
we did not really understand the underlying problem. It does seem that
bound_dev_if is now checked to be non-zero before the bound_dev_if ==
dif || bound_dev_if == sdif comparison, which was not the case before
(especially since the result now depends on l3mdev_accept).
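
To make the difference described above concrete, here is a small shell
model of the two checks. It is only a sketch of our reading of the diff,
not the kernel code: the function names old_match and new_match, the
interface indexes (eth2 = 4, VRF "firewall" = 7) and the exact shape of
the unbound-socket branch are assumptions made for illustration.

```shell
#!/bin/sh
# Simplified model of the socket/device match (NOT the kernel source).
# All arguments are interface indexes; 0 means "socket not bound to a
# device" for BOUND_DEV_IF and "packet not in a VRF" for SDIF.

# Check before the change, as we understand it: a plain comparison.
# usage: old_match BOUND_DEV_IF DIF SDIF
old_match() {
    [ "$1" -eq "$2" ] || [ "$1" -eq "$3" ]
}

# Check after the change (assumed shape): an unbound socket matches when
# the packet is not in a VRF, or when l3mdev_accept is enabled.
# usage: new_match L3MDEV_ACCEPT BOUND_DEV_IF DIF SDIF
new_match() {
    if [ "$2" -eq 0 ]; then
        [ "$4" -eq 0 ] || [ "$1" -ne 0 ]
    else
        [ "$2" -eq "$3" ] || [ "$2" -eq "$4" ]
    fi
}

# Hypothetical case: unbound local socket, SYN-ACK received on eth2
# (index 4), which is enslaved to VRF "firewall" (index 7).
if old_match 0 4 7; then echo "old check: match"; else echo "old check: no match"; fi
if new_match 1 0 4 7; then echo "new check, l3mdev_accept=1: match"; else echo "new check, l3mdev_accept=1: no match"; fi
if new_match 0 0 4 7; then echo "new check, l3mdev_accept=0: match"; else echo "new check, l3mdev_accept=0: no match"; fi
```

In this model, only the new check with l3mdev_accept=1 lets the unbound
socket match a packet received inside the VRF, which is consistent with
the issue going away when tcp_l3mdev_accept is set to 0.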

Maybe our setup is wrong and we should not be able to route packets like 
that?

Thanks a lot and have a nice day!

Maximilien Cuony

