Message-ID: <Z7NKYMY7fJT5cYWu@shredder>
Date: Mon, 17 Feb 2025 16:40:32 +0200
From: Ido Schimmel <idosch@...sch.org>
To: Justin Iurman <justin.iurman@...ege.be>
Cc: netdev@...r.kernel.org, davem@...emloft.net, dsahern@...nel.org,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
horms@...nel.org, Alexander Aring <alex.aring@...il.com>,
David Lebrun <dlebrun@...gle.com>
Subject: Re: [PATCH net v2 2/3] net: ipv6: fix lwtunnel loops in ioam6, rpl
and seg6
On Sun, Feb 16, 2025 at 06:31:06PM +0200, Ido Schimmel wrote:
> On Thu, Feb 13, 2025 at 11:51:49PM +0100, Justin Iurman wrote:
> > On 2/13/25 14:28, Ido Schimmel wrote:
> > > On Tue, Feb 11, 2025 at 11:16:23PM +0100, Justin Iurman wrote:
> > > > When the destination is the same post-transformation, we enter a
> > > > lwtunnel loop. This is true for ioam6_iptunnel, rpl_iptunnel, and
> > > > seg6_iptunnel, in both input() and output() handlers respectively, where
> > > > either dst_input() or dst_output() is called at the end. It happens for
> > > > instance with the ioam6 inline mode, but can also happen for any of them
> > > > as long as the post-transformation destination still matches the fib
> > > > entry. Note that ioam6_iptunnel was already comparing the old and new
> > > > destination address to prevent the loop, but it is not enough (e.g.,
> > > > other addresses can still match the same subnet).
> > > >
> > > > Here is an example for rpl_input():
> > > >
> > > > dump_stack_lvl+0x60/0x80
> > > > rpl_input+0x9d/0x320
> > > > lwtunnel_input+0x64/0xa0
> > > > lwtunnel_input+0x64/0xa0
> > > > lwtunnel_input+0x64/0xa0
> > > > lwtunnel_input+0x64/0xa0
> > > > lwtunnel_input+0x64/0xa0
> > > > [...]
> > > > lwtunnel_input+0x64/0xa0
> > > > lwtunnel_input+0x64/0xa0
> > > > lwtunnel_input+0x64/0xa0
> > > > lwtunnel_input+0x64/0xa0
> > > > lwtunnel_input+0x64/0xa0
> > > > ip6_sublist_rcv_finish+0x85/0x90
> > > > ip6_sublist_rcv+0x236/0x2f0
> > > >
> > > > ... until rpl_do_srh() fails, which means skb_cow_head() failed.
> > > >
> > > > This patch prevents that kind of loop by redirecting to the origin
> > > > input() or output() when the destination is the same
> > > > post-transformation.
> > >
> > > A loop was reported a few months ago with a similar stack trace:
> > > https://lore.kernel.org/netdev/2bc9e2079e864a9290561894d2a602d6@akamai.com/
> > >
> > > But even with this series applied my VM gets stuck. Can you please check
> > > if the fix is incomplete?
> >
> > Good catch! Indeed, seg6_local also needs to be fixed the same way.
> >
> > Back to my first idea: maybe we could directly fix it in lwtunnel_input()
> > and lwtunnel_output() to make our lives easier, but we'd have to be careful
> > to modify all users accordingly. The users I'm 100% sure that are concerned:
> > ioam6 (output), rpl (input/output), seg6 (input/output), seg6_local (input).
> > Other users I'm not totally sure (to be checked): ila (output), bpf (input).
> >
> > Otherwise, we'll need to apply the fix to each user concerned (probably the
> > safest (best?) option right now). Any opinions?
>
> I audited the various lwt users and I agree with your analysis about
> which users seem to be affected by this issue.
>
> I'm not entirely sure how you want to fix this in
> lwtunnel_{input,output}() given that only the input()/output() handlers
> of the individual lwt users are aware of both the old and new dst
> entries.
>
> BTW, I noticed that bpf implements the xmit() hook in addition to
> input()/output(). I wonder if a loop is possible in the following case:
>
> ip_finish_output2() <---------------------+
>   lwtunnel_xmit()                         |
>     bpf_xmit()                            |
>       // bpf program does not change      |
>       // the packet and returns           |
>       // BPF_LWT_REROUTE                  |
>       bpf_lwt_xmit_reroute()              |
>         // unmodified packet resolves     |
>         // the same dst entry             |
>         dst_output()                      |
>           ip_output() --------------------+

FWIW, verified that this is indeed the case. Reproducer:

$ cat lwt_xmit_repo.bpf.c
// SPDX-License-Identifier: GPL-2.0

#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("lwt_xmit")
int repo(struct __sk_buff *skb)
{
	return BPF_LWT_REROUTE;
}

$ clang -O2 -target bpf -c lwt_xmit_repo.bpf.c -o lwt_xmit_repo.o
# ip link add name dummy1 up type dummy
# ip route add 192.0.2.0/24 nexthop encap bpf xmit obj ./lwt_xmit_repo.o sec lwt_xmit dev dummy1
# ping 192.0.2.1
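
For illustration only, here is a small userspace model of the loop above. It is not kernel code: struct model_dst, model_lookup() and model_output() are hypothetical names, and the "guard" (stop rerouting when the lookup returns the same dst entry) is an assumption borrowed from the idea in the patch description, not a proposed fix for bpf_xmit().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dst entry; not the kernel's struct dst_entry. */
struct model_dst {
	bool has_xmit_reroute;	/* route carries an xmit hook that asks to reroute */
};

/* Hypothetical route lookup: an unmodified packet resolves the same entry. */
static const struct model_dst *model_lookup(const struct model_dst *cur)
{
	return cur;
}

/*
 * Model of the output path. Returns the number of passes through the
 * xmit hook, or -1 if max_depth passes were made without leaving the
 * hook (i.e. the path is looping).
 */
static int model_output(const struct model_dst *dst, bool guard, int max_depth)
{
	int passes = 0;

	assert(dst != NULL);

	while (dst->has_xmit_reroute) {
		const struct model_dst *new_dst;

		if (passes == max_depth)
			return -1;	/* still rerouting: treat as a loop */
		passes++;

		new_dst = model_lookup(dst);
		if (guard && new_dst == dst)
			break;		/* same dst: do not re-enter the hook */
		dst = new_dst;
	}

	return passes;
}
```

With a route whose hook always asks for a reroute, model_output(&d, false, 64) hits the depth limit, while model_output(&d, true, 64) leaves after a single pass.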