lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <940fa370-08ce-1d39-d5cc-51de8e853b47@gmail.com>
Date:   Sun, 26 Jun 2022 21:06:50 +0100
From:   Mike Manning <mvrmanning@...il.com>
To:     David Ahern <dsahern@...nel.org>, Jan Luebbe <jluebbe@...net.de>,
        Robert Shearman <robertshearman@...il.com>,
        Andy Roulin <aroulin@...dia.com>
Cc:     "David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
        regressions@...ts.linux.dev,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: [REGRESSION] connection timeout with routes to VRF

On 11/06/2022 17:44, David Ahern wrote:
> On 6/11/22 5:14 AM, Jan Luebbe wrote:
>> Hi,
>>
>> TL;DR: We think we have found a regression in the handling of VRF route leaking
>> caused by "net: allow binding socket in a VRF when there's an unbound socket"
>> (3c82a21f4320).
> This is the 3rd report in the past few months about this commit.
>
> ...
>
>> Our minimized test case looks like this:
>>  ip rule add pref 32765 from all lookup local
>>  ip rule del pref 0 from all lookup local
>>  ip link add red type vrf table 1000
>>  ip link set red up
>>  ip route add vrf red unreachable default metric 8192
>>  ip addr add dev red 172.16.0.1/24
>>  ip route add 172.16.0.0/24 dev red
>>  ip vrf exec red socat -dd TCP-LISTEN:1234,reuseaddr,fork SYSTEM:"echo connected" &
>>  sleep 1
>>  nc 172.16.0.1 1234 < /dev/null
>>
> ...
> Thanks for the detailed analysis and reproducer.
>
>> The partial revert
>> diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
>> index 98e1ec1a14f0..41e7f20d7e51 100644
>> --- a/include/net/inet_hashtables.h
>> +++ b/include/net/inet_hashtables.h
>> @@ -310,8 +310,9 @@ static inline struct sock *inet_lookup_listener(struct net *net,
>>  #define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif, __sdif) \
>>         (((__sk)->sk_portpair == (__ports))                     &&      \
>>          ((__sk)->sk_addrpair == (__cookie))                    &&      \
>> -        (((__sk)->sk_bound_dev_if == (__dif))                  ||      \
>> -         ((__sk)->sk_bound_dev_if == (__sdif)))                &&      \
>> +        (!(__sk)->sk_bound_dev_if      ||                              \
>> +          ((__sk)->sk_bound_dev_if == (__dif))                 ||      \
>> +          ((__sk)->sk_bound_dev_if == (__sdif)))               &&      \
>>          net_eq(sock_net(__sk), (__net)))
>>  #else /* 32-bit arch */
>>  #define INET_ADDR_COOKIE(__name, __saddr, __daddr) \
>> @@ -321,8 +322,9 @@ static inline struct sock *inet_lookup_listener(struct net *net,
>>         (((__sk)->sk_portpair == (__ports))             &&              \
>>          ((__sk)->sk_daddr      == (__saddr))           &&              \
>>          ((__sk)->sk_rcv_saddr  == (__daddr))           &&              \
>> -        (((__sk)->sk_bound_dev_if == (__dif))          ||              \
>> -         ((__sk)->sk_bound_dev_if == (__sdif)))        &&              \
>> +        (!(__sk)->sk_bound_dev_if      ||                              \
>> +          ((__sk)->sk_bound_dev_if == (__dif))         ||              \
>> +          ((__sk)->sk_bound_dev_if == (__sdif)))       &&              \
>>          net_eq(sock_net(__sk), (__net)))
>>  #endif /* 64-bit arch */
>>
>> restores the original behavior when applied on v5.18. This doesn't apply
>> directly on master, as the macro was replaced by an inline function in "inet:
>> add READ_ONCE(sk->sk_bound_dev_if) in INET_MATCH()" (4915d50e300e).
>>
>> I have to admit I don't quite understand 3c82a21f4320, so I'm not sure how to
>> proceed. What would be broken by the partial revert above? Are there better ways
>> to configure routing into the VRF than simply "ip route add 172.16.0.0/24 dev
>> red" that still work?
>>
>> Thanks,
>> Jan
>>
>> #regzbot introduced: 3c82a21f4320
>>
>>
>>
> Andy Roulin suggested the same fix to the same problem a few weeks back.
> Let's do it along with a test case in fcnl-test.sh which covers all of
> these vrf permutations.
>
Reverting 3c82a21f4320 would remove isolation between the default and other VRFs
needed when no VRF route leaking has been configured between these: there may be
unintended leaking of packets arriving on a device enslaved to an l3mdev due to
the potential match on an unbound socket.

VRF route leaking requires routes to be present for both ingress and egress VRFs,
the testcase shown only has a route from default to red VRF. The implicit return
path from red to default VRF due to match on unbound socket is no longer present.

Match on unbound socket in all VRFs and not only in the default VRF should be
possible by setting this option (see
https://www.kernel.org/doc/Documentation/networking/vrf.txt):

sysctl net.ipv4.tcp_l3mdev_accept=1

However, for this to work a change similar to the following is needed (I have
shown the change to the macro for consistency with above, it is now an inline fn):

---
 include/net/inet_hashtables.h |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -300,9 +300,8 @@
 #define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif, __sdif) \
        (((__sk)->sk_portpair == (__ports))                     &&      \
         ((__sk)->sk_addrpair == (__cookie))                    &&      \
-        (((__sk)->sk_bound_dev_if == (__dif))                  ||      \
-         ((__sk)->sk_bound_dev_if == (__sdif)))                &&      \
-        net_eq(sock_net(__sk), (__net)))
+        net_eq(sock_net(__sk), (__net))                        &&      \
+        inet_sk_bound_dev_eq((__net), (__sk)->sk_bound_dev_if, (__dif), (__sdif)))
 #else /* 32-bit arch */
 #define INET_ADDR_COOKIE(__name, __saddr, __daddr) \
        const int __name __deprecated __attribute__((unused))
@@ -311,9 +310,8 @@
        (((__sk)->sk_portpair == (__ports))             &&              \
         ((__sk)->sk_daddr      == (__saddr))           &&              \
         ((__sk)->sk_rcv_saddr  == (__daddr))           &&              \
-        (((__sk)->sk_bound_dev_if == (__dif))          ||              \
-         ((__sk)->sk_bound_dev_if == (__sdif)))        &&              \
-        net_eq(sock_net(__sk), (__net)))
+        net_eq(sock_net(__sk), (__net))                &&              \
+        inet_sk_bound_dev_eq((__net), (__sk)->sk_bound_dev_if, (__dif), (__sdif)))
 #endif /* 64-bit arch */

 /* Sockets in TCP_CLOSE state are _always_ taken out of the hash, so we need


This is to get the testcase to pass, I will leave it to others to comment on
the testcase validity in terms of testing forwarding using commands on 1 device.

The series that 3c82a21f4320 is part of were introduced into the kernel in 2018
by the Vyatta team, who regularly run an extensive test suite for routing protocols
for VRF functionality incl. all combinations of route leaking between default and
other VRFs, so there is no known issue in this regard. I will attempt to reach out
to them so as to advise them of this thread.

Thanks
Mike 
 





 
 



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ