Message-ID: <CANn89iL-YVSkZyQ6OK4TqQ9w0EEQoCvNFWJbAaNreK3LLFBhcA@mail.gmail.com>
Date: Sat, 3 Jan 2026 10:55:49 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Florian Westphal <fw@...len.de>
Cc: netdev@...r.kernel.org, Paolo Abeni <pabeni@...hat.com>, 
	"David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, dsahern@...nel.org, 
	syzbot+4393c47753b7808dac7d@...kaller.appspotmail.com
Subject: Re: [PATCH net] inet: frags: drop fraglist conntrack references

On Fri, Jan 2, 2026 at 3:00 PM Florian Westphal <fw@...len.de> wrote:
>
> Jakub added a warning in nf_conntrack_cleanup_net_list() to make debugging
> leaked skbs/conntrack references more obvious.
>
> syzbot reports this as triggering, and I can also reproduce this via
> ip_defrag.sh selftest:
>
>  conntrack cleanup blocked for 60s
>  WARNING: net/netfilter/nf_conntrack_core.c:2512
>  [..]
>
> conntrack cleanup gets stuck because there are skbs that still hold nf_conn
> references via their frag_list.
>
>    net.core.skb_defer_max=0 makes the hang disappear.
>
> Eric Dumazet points out that skb_release_head_state() doesn't follow the
> fraglist.
>
> ip_defrag.sh can only reproduce this problem since
> commit 6471658dc66c ("udp: use skb_attempt_defer_free()"), but AFAICS this
> problem could happen with TCP as well if pmtu discovery is off.
>
> The relevant problem path for udp is:
> 1. netns emits fragmented packets
> 2. nf_defrag_v6_hook reassembles them (in output hook)
> 3. reassembled skb is tracked (skb owns nf_conn reference)
> 4. ip6_output refragments
> 5. refragmented packets also own nf_conn reference (ip6_fragment
>    calls ip6_copy_metadata())
> 6. on input path, nf_defrag_v6_hook skips defragmentation: the
>    fragments already have skb->nf_conn attached
> 7. skbs are reassembled via ipv6_frag_rcv()
> 8. skb_consume_udp -> skb_attempt_defer_free() -> skb ends up
>    in pcpu freelist, but still has nf_conn reference.
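
[Illustration only, not part of the patch.] The path above can be sketched
as a toy reference-counting model in Python. Conn, Skb, attach(),
reassemble() and release_head_state() are hypothetical stand-ins and not
kernel APIs; the only behaviour taken from the message is that the head-only
release does not walk frag_list, so references held by the chained fragments
survive.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Conn:
    """Toy stand-in for an nf_conn: only a reference counter."""
    refs: int = 0


@dataclass
class Skb:
    """Toy stand-in for an sk_buff with an optional conntrack reference."""
    ct: Optional[Conn] = None
    frag_list: List["Skb"] = field(default_factory=list)


def attach(skb: Skb, conn: Conn) -> None:
    """Give the skb a reference on the connection (steps 3 and 5)."""
    skb.ct = conn
    conn.refs += 1


def reassemble(frags: List[Skb]) -> Skb:
    """Chain all fragments but the first onto the head's frag_list (step 7)."""
    head = frags[0]
    head.frag_list = frags[1:]
    return head


def release_head_state(skb: Skb) -> None:
    """Drop only the head's reference; frag_list is deliberately not walked."""
    if skb.ct is not None:
        skb.ct.refs -= 1
        skb.ct = None


def leaked_refs(n_frags: int) -> int:
    """Return the references still held after a head-only release (step 8)."""
    conn = Conn()
    frags = [Skb() for _ in range(n_frags)]
    for frag in frags:
        attach(frag, conn)
    head = reassemble(frags)
    release_head_state(head)
    return conn.refs


if __name__ == "__main__":
    print(leaked_refs(3))
```

In this model every fragment other than the head keeps its reference, which
corresponds to the situation that blocks conntrack cleanup.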
>
> Possible solutions:
>  1 let defrag engine drop nf_conn entry, OR
>  2 export kick_defer_list_purge() and call it from the conntrack
>    netns exit callback, OR
>  3 add skb_has_frag_list() check to skb_attempt_defer_free()
>
> Options 2 and 3 also solve the ip_defrag.sh hang but share the same drawback:
>
> Such reassembled skbs, queued to a socket, can prevent conntrack module
> removal until userspace has consumed the packet. While both the TCP and UDP
> stacks call nf_reset_ct() before placing an skb on the socket queue, that
> function doesn't iterate frag_list skbs.
>
> Therefore drop the nf_conn entries of fragments when they are placed in the
> defrag queue. Keep the nf_conn entry of the first (offset 0) skb so that the
> reassembled skb retains its nf_conn entry for the sake of the TX path.
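
[Illustration only, not part of the patch.] A minimal Python sketch of the
chosen approach, assuming a toy model in which each fragment carries a byte
offset. Conn, Skb, defrag_enqueue() and the other names are hypothetical and
do not correspond to kernel functions; the sketch only shows the rule stated
above: fragments lose their nf_conn reference when queued, except the
offset-0 fragment.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Conn:
    """Toy stand-in for an nf_conn: only a reference counter."""
    refs: int = 0


@dataclass
class Skb:
    """Toy stand-in for an sk_buff fragment at a given byte offset."""
    offset: int
    ct: Optional[Conn] = None
    frag_list: List["Skb"] = field(default_factory=list)


def attach(skb: Skb, conn: Conn) -> None:
    skb.ct = conn
    conn.refs += 1


def drop_ct(skb: Skb) -> None:
    """Release this skb's own reference, if it holds one."""
    if skb.ct is not None:
        skb.ct.refs -= 1
        skb.ct = None


def defrag_enqueue(queue: List[Skb], skb: Skb) -> None:
    """Queue a fragment; only the offset-0 fragment keeps its reference."""
    if skb.offset != 0:
        drop_ct(skb)
    queue.append(skb)


def reassemble(queue: List[Skb]) -> Skb:
    """Use the offset-0 fragment as head and chain the rest on frag_list."""
    frags = sorted(queue, key=lambda s: s.offset)
    head = frags[0]
    head.frag_list = frags[1:]
    return head


def simulate(offsets: List[int]) -> Tuple[int, int]:
    """Return (refs after reassembly, refs after a head-only release)."""
    conn = Conn()
    queue: List[Skb] = []
    for off in offsets:
        skb = Skb(offset=off)
        attach(skb, conn)
        defrag_enqueue(queue, skb)
    head = reassemble(queue)
    refs_after_reassembly = conn.refs
    drop_ct(head)  # head-only release, frag_list is not walked
    return refs_after_reassembly, conn.refs


if __name__ == "__main__":
    print(simulate([0, 1480, 2960]))
```

In this model the reassembled skb still carries exactly one reference (the
head's), and a release that never looks at frag_list is enough to bring the
count to zero, regardless of the order in which fragments arrive.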
>
> Note that the Fixes tag is incorrect; it points to the commit introducing the
> 'ip_defrag.sh reproducible problem': no need to backport this patch to
> every stable kernel.
>
> Reported-by: syzbot+4393c47753b7808dac7d@...kaller.appspotmail.com
> Closes: https://lore.kernel.org/netdev/693b0fa7.050a0220.4004e.040d.GAE@google.com/
> Fixes: 6471658dc66c ("udp: use skb_attempt_defer_free()")
> Signed-off-by: Florian Westphal <fw@...len.de>
> ---

Thanks a lot Florian for taking care of this.

Reviewed-by: Eric Dumazet <edumazet@...gle.com>
