netdev - Re: [PATCH net 3/4] inet: frags: flush pending skbs in fqdir_pre

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CANn89iJgoxOjGjhBAHeaCdcd3X9wzRoUg27e3TSY4X+SR0aBdQ@mail.gmail.com>
Date: Mon, 8 Dec 2025 07:17:33 -0800
From: Eric Dumazet <edumazet@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: davem@...emloft.net, netdev@...r.kernel.org, pabeni@...hat.com, 
	andrew+netdev@...n.ch, horms@...nel.org, pablo@...filter.org, fw@...len.de, 
	netfilter-devel@...r.kernel.org, willemdebruijn.kernel@...il.com, 
	kuniyu@...gle.com
Subject: Re: [PATCH net 3/4] inet: frags: flush pending skbs in fqdir_pre_exit()

On Sat, Dec 6, 2025 at 5:10 PM Jakub Kicinski <kuba@...nel.org> wrote:
>
> We have been seeing occasional deadlocks on pernet_ops_rwsem since
> September in NIPA. The stuck task was usually modprobe (often loading
> a driver like ipvlan), trying to take the lock as a Writer.
> lockdep does not track readers for rwsems so the read wasn't obvious
> from the reports.
>
> On closer inspection the Reader holding the lock was conntrack looping
> forever in nf_conntrack_cleanup_net_list(). Based on past experience
> with occasional NIPA crashes I looked thru the tests which run before
> the crash and noticed that the crash follows ip_defrag.sh. An immediate
> red flag. Scouring thru (de)fragmentation queues reveals skbs sitting
> around, holding conntrack references.
>
> The problem is that since conntrack depends on nf_defrag_ipv6,
> nf_defrag_ipv6 will load first. Since nf_defrag_ipv6 loads first its
> netns exit hooks run _after_ conntrack's netns exit hook.
>
> Flush all fragment queue SKBs during fqdir_pre_exit() to release
> conntrack references before conntrack cleanup runs. Also flush
> the queues in timer expiry handlers when they discover fqdir->dead
> is set, in case packet sneaks in while we're running the pre_exit
> flush.
>
> The commit under Fixes is not exactly the culprit, but I think
> previously the timer firing would eventually unblock the spinning
> conntrack.
>
> Fixes: d5dd88794a13 ("inet: fix various use-after-free in defrags units")
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>

Reviewed-by: Eric Dumazet <edumazet@...gle.com>