netdev - Re: Namespaced network devices not cleaned up properly after execution of pmtu.sh kernel selftest


Message-ID: <CANn89iJ==5rnYa1CrtP113C_4JYQeuT6wdcJ58aa6jm-V1uqLw@mail.gmail.com>
Date: Fri, 13 Sep 2024 15:50:56 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Mitchell Augustin <mitchell.augustin@...onical.com>
Cc: Jakub Kicinski <kuba@...nel.org>, "David S. Miller" <davem@...emloft.net>, 
	Paolo Abeni <pabeni@...hat.com>, Jiri Pirko <jiri@...nulli.us>, 
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Lorenzo Bianconi <lorenzo@...nel.org>, 
	Daniel Borkmann <daniel@...earbox.net>, netdev@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Jacob Martin <jacob.martin@...onical.com>, dann frazier <dann.frazier@...onical.com>
Subject: Re: Namespaced network devices not cleaned up properly after
 execution of pmtu.sh kernel selftest

On Fri, Sep 13, 2024 at 3:45 PM Mitchell Augustin
<mitchell.augustin@...onical.com> wrote:
>
> Hi Jakub,
> Executing ./pmtu.sh pmtu_ipv6_ipv6_exception manually will only
> trigger the pmtu_ipv6_ipv6_exception sub-case, which only takes a
> second to run on my machines, so you shouldn't need to run the
> entirety of pmtu.sh to trigger the bug. It won't trigger on attempt
> #1, but in my experience, when I do it in that while loop, it will
> trigger in under a minute reliably.
>
> > Somewhat tangentially but if you'd be willing I wouldn't mind if you
> > were to send patches to break this test up upstream, too. It takes
> > 1h23m to run with various debug kernel options enabled. If we split
> > it into multiple smaller tests each running 10min or 20min we can
> > then spawn multiple VMs and get the results faster.
>
> This logical division of tests already exists in pmtu.sh if you pass a
> sub-test name in as the first parameter like above, but if you think
> there would be value in separating them out further or into different
> files not all in pmtu.sh, I would be happy to help with that. Just let
> me know.
>
> Regardless, I will go ahead and work on a new regression test that
> executes just our quick reproducer for this specific bug and will send
> it to this list.
>
> Thanks,
> Mitchell Augustin
>
> On Thu, Sep 12, 2024 at 9:13 PM Jakub Kicinski <kuba@...nel.org> wrote:
> >
> > On Wed, 11 Sep 2024 17:20:29 -0500 Mitchell Augustin wrote:
> > > We recently identified a bug still impacting upstream, triggered
> > > occasionally by one of the kernel selftests (net/pmtu.sh) that
> > > sometimes causes the following behavior:
> > > * One of this test's namespaced network devices does not get properly
> > > cleaned up when the namespace is destroyed, evidenced by
> > > `unregister_netdevice: waiting for veth_A-R1 to become free. Usage
> > > count = 5` appearing in the dmesg output repeatedly
> > > * Once we start to see the above `unregister_netdevice` message, an
> > > un-cancelable hang will occur on subsequent attempts to run `modprobe
> > > ip6_vti` or `rmmod ip6_vti`
> >
> > Thanks for the report! We have seen it in our CI as well, it happens
> > maybe once a day. But as you say, on x86 it is quite hard to reproduce,
> > and nothing obvious stood out as a culprit.
> >
> > > However, I can easily reproduce the issue on an Nvidia Grace/Hopper
> > > machine (and other platforms with modern CPUs) with the performance
> > > governor set by doing the following:
> > > * Install/boot any affected kernel
> > > * Clone the kernel tree just to get an older version of the test cases
> > > without subtle timing changes that mask the issue (such as
> > > https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/noble/tree/?h=Ubuntu-6.8.0-39.39)
> > > * cd tools/testing/selftests/net
> > > * while true; do sudo ./pmtu.sh pmtu_ipv6_ipv6_exception; done
> >
> > That's exciting! Would you be able to try to cut down the test itself
> > (it is quite long and has a ton of sub-cases) and figure out which
> > sub-cases trigger this? Maybe with an even quicker repro we'll bisect,
> > or someone will correctly guess the fix.
> >
> > Somewhat tangentially but if you'd be willing I wouldn't mind if you
> > were to send patches to break this test up upstream, too. It takes
> > 1h23m to run with various debug kernel options enabled. If we split
> > it into multiple smaller tests each running 10min or 20min we can
> > then spawn multiple VMs and get the results faster.
>

Note that this issue has been discussed already with Paolo Abeni.

The problem lies in the dst_cache infrastructure.
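
For anyone who wants to try the reproducer quoted above, it can be bounded so that it stops as soon as the symptom shows up in the kernel log. The following is only a rough sketch, not the regression test mentioned in the thread: the function names `has_netdev_leak` and `repro_loop` and the default iteration cap are hypothetical, and it assumes you run it as root from tools/testing/selftests/net with no earlier `unregister_netdevice` messages already in dmesg.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): reads kernel log text on
# stdin and succeeds if it contains the leaked-device message from the report.
has_netdev_leak() {
    grep -q 'unregister_netdevice: waiting for .* to become free'
}

# Hypothetical bounded version of the quoted "while true" loop.
# $1 is the maximum number of iterations (default 100, an arbitrary choice).
# Assumes the current directory contains pmtu.sh and that we are root.
repro_loop() {
    max=${1:-100}
    i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        # Run only the sub-case named in the thread.
        ./pmtu.sh pmtu_ipv6_ipv6_exception >/dev/null 2>&1
        # Stop as soon as the symptom appears in the kernel log.
        if dmesg | has_netdev_leak; then
            echo "symptom seen after $i iterations"
            return 0
        fi
    done
    echo "symptom not seen in $max iterations"
    return 1
}
```

After sourcing the file, `repro_loop 200` would run at most 200 iterations. Because the check greps the whole dmesg buffer, a stale message from an earlier run would be reported as a hit on the first iteration, hence the clean-log assumption.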