lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 06 Jul 2022 20:49:27 +0200
From:   Jan Luebbe <>
To:     Mike Manning <>,
        David Ahern <>,
        Robert Shearman <>,
        Andy Roulin <>
Cc:     "David S. Miller" <>,,,
        linux-kernel <>
Subject: Re: [REGRESSION] connection timeout with routes to VRF

On Sun, 2022-06-26 at 21:06 +0100, Mike Manning wrote:
> Andy Roulin suggested the same fix to the same problem a few weeks back.
> Let's do it along with a test case in which covers all of
> these vrf permutations.
Reverting 3c82a21f4320 would remove isolation between the default and other VRFs
needed when no VRF route leaking has been configured between these: there may be
unintended leaking of packets arriving on a device enslaved to an l3mdev due to
the potential match on an unbound socket.

Thanks for the explanation.

VRF route leaking requires routes to be present for both ingress and egress
the testcase shown only has a route from default to red VRF. The implicit return
path from red to default VRF due to match on unbound socket is no longer

If there is a better configuration that makes this work in the general case
without a change to the kernel, we'd be happy as well.

In our full setup, the outbound TCP connection (from the default VRF) gets a
local IP from the interface enslaved to the VRF. Before 3c82a21f4320, this would
simply work.

How would the return path route from the red VRF to the default VRF look in that

Match on unbound socket in all VRFs and not only in the default VRF should be
possible by setting this option (see

Do you mean unbound as in listening socket not bound to an IP with bind()? Or as
in a socket in the default VRF?

sysctl net.ipv4.tcp_l3mdev_accept=1

The sysctl docs sound like this should only apply to listening sockets. In this
case, we have an unconnected outbound socket.

However, for this to work a change similar to the following is needed (I have
shown the change to the macro for consistency with above, it is now an inline

I can also test on master and only used the macro form only because I wasn't
completely sure how to translate it to the inline function form.

 include/net/inet_hashtables.h |   10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -300,9 +300,8 @@
 #define INET_MATCH(__sk, __net, __cookie, __saddr, __daddr, __ports, __dif,
__sdif) \
        (((__sk)->sk_portpair == (__ports))                     &&      \
         ((__sk)->sk_addrpair == (__cookie))                    &&      \
-        (((__sk)->sk_bound_dev_if == (__dif))                  ||      \
-         ((__sk)->sk_bound_dev_if == (__sdif)))                &&      \
-        net_eq(sock_net(__sk), (__net)))
+        net_eq(sock_net(__sk), (__net))                        &&      \
+        inet_sk_bound_dev_eq((__net), (__sk)->sk_bound_dev_if, (__dif),
 #else /* 32-bit arch */
 #define INET_ADDR_COOKIE(__name, __saddr, __daddr) \
        const int __name __deprecated __attribute__((unused))
@@ -311,9 +310,8 @@
        (((__sk)->sk_portpair == (__ports))             &&              \
         ((__sk)->sk_daddr      == (__saddr))           &&              \
         ((__sk)->sk_rcv_saddr  == (__daddr))           &&              \
-        (((__sk)->sk_bound_dev_if == (__dif))          ||              \
-         ((__sk)->sk_bound_dev_if == (__sdif)))        &&              \
-        net_eq(sock_net(__sk), (__net)))
+        net_eq(sock_net(__sk), (__net))                &&              \
+        inet_sk_bound_dev_eq((__net), (__sk)->sk_bound_dev_if, (__dif),
 #endif /* 64-bit arch */

 /* Sockets in TCP_CLOSE state are _always_ taken out of the hash, so we need

I can confirm that this gets my testcase working with 

This is to get the testcase to pass, I will leave it to others to comment on
the testcase validity in terms of testing forwarding using commands on 1 device.

So a network-namespace-based testcase would be preferred? We used the simple
setup because it seemed easier to understand.

The series that 3c82a21f4320 is part of were introduced into the kernel in 2018
by the Vyatta team, who regularly run an extensive test suite for routing
for VRF functionality incl. all combinations of route leaking between default
other VRFs, so there is no known issue in this regard. I will attempt to reach
to them so as to advise them of this thread.

Are these testcases public? Perhaps I could use them find a better configuration
that handles our use-case.



Powered by blists - more mailing lists