lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Yjnrz7vL9HqE5UBz@shredder>
Date:   Tue, 22 Mar 2022 17:31:27 +0200
From:   Ido Schimmel <idosch@...sch.org>
To:     David Ahern <dsahern@...nel.org>
Cc:     netdev@...r.kernel.org, kuba@...nel.org, davem@...emloft.net,
        greearb@...delatech.com
Subject: Re: [PATCH net-next] net: Add l3mdev index to flow struct and avoid
 oif reset for port devices

On Tue, Mar 22, 2022 at 08:26:48AM -0600, David Ahern wrote:
> On 3/22/22 3:22 AM, Ido Schimmel wrote:
> > On Mon, Mar 14, 2022 at 02:45:51PM -0600, David Ahern wrote:
> >> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> >> index 2af2b99e0bea..fb0e49c36c2e 100644
> >> --- a/net/ipv4/fib_trie.c
> >> +++ b/net/ipv4/fib_trie.c
> >> @@ -1429,11 +1429,8 @@ bool fib_lookup_good_nhc(const struct fib_nh_common *nhc, int fib_flags,
> >>  	    !(fib_flags & FIB_LOOKUP_IGNORE_LINKSTATE))
> >>  		return false;
> >>  
> >> -	if (!(flp->flowi4_flags & FLOWI_FLAG_SKIP_NH_OIF)) {
> >> -		if (flp->flowi4_oif &&
> >> -		    flp->flowi4_oif != nhc->nhc_oif)
> >> -			return false;
> >> -	}
> >> +	if (flp->flowi4_oif && flp->flowi4_oif != nhc->nhc_oif)
> >> +		return false;
> > 
> > David, we have several test cases that are failing which I have tracked
> > down to this patch.
> > 
> > Before the patch, if the original output interface was enslaved to a
> > VRF, the output interface in the flow struct would be updated to the VRF
> > and the 'FLOWI_FLAG_SKIP_NH_OIF' flag would be set, causing the above
> > check to be skipped.
> > 
> > After the patch, the check is no longer skipped, as original output
> > interface is retained and the flag was removed.
> > 
> > This breaks scenarios where a GRE tunnel specifies a dummy device
> > enslaved to a VRF as its physical device. The purpose of this
> > configuration is to redirect the underlay lookup to the table associated
> > with the VRF to which the dummy device is enslaved to. The check fails
> > because 'flp->flowi4_oif' points to the dummy device, whereas
> > 'nhc->nhc_oif' points to the interface via which the encapsulated packet
> > should egress.
> > 
> > Skipping the check when an l3mdev was set seems to solve the problem:
> > 
> > diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> > index fb0e49c36c2e..cf1164e05d92 100644
> > --- a/net/ipv4/fib_trie.c
> > +++ b/net/ipv4/fib_trie.c
> > @@ -1429,7 +1429,8 @@ bool fib_lookup_good_nhc(const struct fib_nh_common *nhc, int fib_flags,
> >             !(fib_flags & FIB_LOOKUP_IGNORE_LINKSTATE))
> >                 return false;
> >  
> > -       if (flp->flowi4_oif && flp->flowi4_oif != nhc->nhc_oif)
> > +       if (!flp->flowi4_l3mdev &&
> > +           flp->flowi4_oif && flp->flowi4_oif != nhc->nhc_oif)
> >                 return false;
> >  
> >         return true;
> > 
> > AFAICT, this scenario does not break with ip6gre/ip6gretap tunnels
> > because 'RT6_LOOKUP_F_IFACE' is not set in
> > ip6_route_output_flags_noref() in this case.
> > 
> > WDYT? I plan to test this patch in our regression, but I'm not sure if I
> > missed other cases that might remain broken.
> 
> one of the requests with VRF has been to bind a socket to a port device
> and expect the lookup to enforce use of that egress port (e.g.,
> multipath). Switching the oif to the VRF device and then ignoring the
> oif check was making that check too flexible for that use case.

I see

> 
> What's the callchain for this failure? Perhaps the
> FLOWI_FLAG_SKIP_NH_OIF needs to be kept for this particular use case.

This is the stack trace for the failure:

    fib_lookup_good_nhc+5
    fib_table_lookup+3281
    fib4_rule_action+501
    fib_rules_lookup+858
    __fib_lookup+233
    fib_lookup.constprop.0+926
    ip_route_output_key_hash_rcu+3707
    ip_route_output_key_hash+392
    ip_route_output_flow+33
    ip_tunnel_xmit+1794
    gre_tap_xmit+1312
    dev_hard_start_xmit+448
    sch_direct_xmit+615
    __dev_queue_xmit+4841

The GRE tap is using a dummy device enslaved to a VRF as its physical
device.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ