Message-ID: <20220228205440.GA24680@debian.home>
Date: Mon, 28 Feb 2022 21:54:40 +0100
From: Guillaume Nault <gnault@...hat.com>
To: David Ahern <dsahern@...nel.org>
Cc: David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>, netdev@...r.kernel.org,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>
Subject: Re: [PATCH net] ipv4: fix route lookups when handling ICMP redirects
and PMTU updates
On Mon, Feb 28, 2022 at 10:31:58AM -0700, David Ahern wrote:
> On 2/28/22 10:16 AM, Guillaume Nault wrote:
> > Fixes: d3a25c980fc2 ("ipv4: Fix nexthop exception hash computation.")
>
> That does not seem related to tos in the flow struct at all.
Ouch, copy/paste mistake.
I meant 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions."), which is
the next commit listed by 'git log -- net/ipv4/route.c'.
Really sorry :/, and thanks a lot for catching that!
> > diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> > index f33ad1f383b6..d5d058de3664 100644
> > --- a/net/ipv4/route.c
> > +++ b/net/ipv4/route.c
> > @@ -499,6 +499,15 @@ void __ip_select_ident(struct net *net, struct iphdr *iph, int segs)
> > }
> > EXPORT_SYMBOL(__ip_select_ident);
> >
> > +static void ip_rt_fix_tos(struct flowi4 *fl4)
>
> make this a static inline in include/net/flow.h and update
> flowi4_init_output and flowi4_update_output to use it. That should cover
> a few of the cases below leaving just ...
Hum, I didn't think about this option, but it looks risky to me. As I
put it in note 1, ip_route_output_key_hash() unconditionally sets
->flowi4_scope, assuming it can infer the scope from the RTO_ONLINK bit
of ->flowi4_tos. If we sanitise these fields in flowi4_init_output()
(and flowi4_update_output()), then ip_route_output_key_hash() would
sometimes work on already sanitised values and sometimes not. So it
wouldn't know if it should initialise ->flowi4_scope.
We could decide to let ip_route_output_key_hash() initialise
->flowi4_scope only when the RTO_ONLINK bit is set, which
guarantees that we don't have sanitised values. But before that, we'd
need to audit all other callers, to verify that they correctly
initialise the ->flowi4_scope with RT_SCOPE_UNIVERSE, since
ip_route_output_key_hash() isn't going to do it for them anymore.
I'll audit all these callers, but that should be something for
net-next.
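
To make the concern concrete, here's a rough userspace sketch (not kernel
code). struct flow_sketch and the helper names are hypothetical stand-ins
for struct flowi4 and the fix-up logic, and the constants are local copies
that I'm assuming mirror the kernel's values. It shows that an
unconditional fix-up applied to an already sanitised flow resets the scope
to RT_SCOPE_UNIVERSE, while a conditional one keeps it (but then relies on
callers having initialised the scope themselves):

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins, assumed to mirror the kernel's values. */
#define IPTOS_RT_MASK     0x1C
#define RTO_ONLINK        0x01
#define RT_SCOPE_UNIVERSE 0
#define RT_SCOPE_LINK     253

/* Hypothetical, reduced stand-in for struct flowi4. */
struct flow_sketch {
	uint8_t tos;
	uint8_t scope;
};

/* Unconditional fix-up: always derives the scope from the tos bits. */
static void fix_tos_unconditional(struct flow_sketch *fl)
{
	uint8_t tos = fl->tos;

	fl->tos = tos & IPTOS_RT_MASK;
	fl->scope = (tos & RTO_ONLINK) ? RT_SCOPE_LINK : RT_SCOPE_UNIVERSE;
}

/* Conditional variant: only touches the scope if RTO_ONLINK is still set. */
static void fix_tos_conditional(struct flow_sketch *fl)
{
	uint8_t tos = fl->tos;

	fl->tos = tos & IPTOS_RT_MASK;
	if (tos & RTO_ONLINK)
		fl->scope = RT_SCOPE_LINK;
}

/* Runs the given fix-up twice on an on-link flow; returns the final scope. */
static uint8_t scope_after_two_passes(void (*fix)(struct flow_sketch *))
{
	struct flow_sketch fl = { .tos = 0x10 | RTO_ONLINK,
				  .scope = RT_SCOPE_UNIVERSE };

	fix(&fl);
	assert(fl.scope == RT_SCOPE_LINK);	/* first pass sees RTO_ONLINK */
	fix(&fl);				/* second pass sees sanitised tos */
	return fl.scope;
}
```

With the unconditional helper, scope_after_two_passes() ends up with
RT_SCOPE_UNIVERSE: the on-link information is lost on the second pass.
With the conditional helper it stays at RT_SCOPE_LINK.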
> > @@ -2613,9 +2625,7 @@ struct rtable *ip_route_output_key_hash(struct net *net, struct flowi4 *fl4,
> > struct rtable *rth;
> >
> > fl4->flowi4_iif = LOOPBACK_IFINDEX;
> > - fl4->flowi4_tos = tos & IPTOS_RT_MASK;
> > - fl4->flowi4_scope = ((tos & RTO_ONLINK) ?
> > - RT_SCOPE_LINK : RT_SCOPE_UNIVERSE);
> > + ip_rt_fix_tos(fl4);
>
> ... this one to call the new helper.
BTW, here's a bit more about the context around this patch.
I found the problem while working on removing the use of RTO_ONLINK, so
that ->flowi4_tos could be converted to dscp_t.
The objective is to modify callers so that they'd set ->flowi4_scope
directly, instead of using RTO_ONLINK to mark their intention (and that's
why I said I'd have to audit them anyway).
Once that's done, ip_rt_fix_tos() won't have to touch the scope
anymore. And once ->flowi4_tos is converted to dscp_t, we'll be able to
remove that function entirely, since dscp_t ensures ECN bits are cleared
(IPTOS_RT_MASK also ensures that high order bits are cleared, but
that's redundant with the RT_TOS() calls already done by callers, which
arguably aren't really desirable anyway).
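
As an illustration of why a dedicated DSCP type makes the masking
unnecessary, here's a small userspace sketch. All names are hypothetical
stand-ins (this is not the kernel's dscp_t, which is type-checked
differently), and the 0xFC mask is my assumption for "upper six bits of
the DS field". The point is only that the ECN bits are dropped once, at
conversion time, so later code can't see them:

```c
#include <stdint.h>

/* Assumed DSCP mask: upper six bits of the DS field, ECN bits excluded. */
#define SKETCH_DSCP_MASK 0xFC

/* Hypothetical stand-in for a DSCP type: wrapping the byte in a struct
 * means a raw uint8_t can't be passed where a DSCP value is expected
 * without going through the conversion helper.
 */
typedef struct {
	uint8_t v;
} dscp_sketch_t;

/* Converts a raw DS field to the DSCP type, clearing the two ECN bits. */
static dscp_sketch_t dsfield_to_dscp_sketch(uint8_t dsfield)
{
	dscp_sketch_t d = { .v = dsfield & SKETCH_DSCP_MASK };

	return d;
}

/* Converts back to a raw byte; ECN bits are always zero here. */
static uint8_t dscp_sketch_to_dsfield(dscp_sketch_t d)
{
	return d.v;
}
```

Any value that went through dsfield_to_dscp_sketch() has its two low
order bits cleared, so a lookup taking dscp_sketch_t wouldn't need a
separate fix-up step.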
> >
> > rcu_read_lock();
> > rth = ip_route_output_key_hash_rcu(net, fl4, &res, skb);
>