Message-ID: <20241007112904.GA27104@breakpoint.cc>
Date: Mon, 7 Oct 2024 13:29:04 +0200
From: Florian Westphal <fw@...len.de>
To: Suren Baghdasaryan <surenb@...gle.com>
Cc: Florian Westphal <fw@...len.de>, Ben Greear <greearb@...delatech.com>,
	netdev <netdev@...r.kernel.org>, kent.overstreet@...ux.dev,
	pablo@...filter.org
Subject: Re: nf-nat-core: allocated memory at module unload.

Suren Baghdasaryan <surenb@...gle.com> wrote:
> On Tue, Oct 1, 2024 at 12:36 PM Florian Westphal <fw@...len.de> wrote:
> >
> > Ben Greear <greearb@...delatech.com> wrote:
> >
> > [ CCing codetag folks ]
> 
> Thanks! I've been on vacation and just saw this report.
> 
> >
> > > Hello,
> > >
> > > I see this splat in 6.11.0 (plus a single patch to fix vrf xmit deadlock).
> > >
> > > Is this a known issue?  Is it a serious problem?
> >
> > Not known to me.  Looks like an mm (rcu)+codetag problem.
> >
> > > ------------[ cut here ]------------
> > > net/netfilter/nf_nat_core.c:1114 module nf_nat func:nf_nat_register_fn has 256 allocated at module unload
> > > WARNING: CPU: 1 PID: 10421 at lib/alloc_tag.c:168 alloc_tag_module_unload+0x22b/0x3f0
> > > Modules linked in: nf_nat(-) btrfs ufs qnx4 hfsplus hfs minix vfat msdos fat
> > ...
> > > Hardware name: Default string Default string/SKYBAY, BIOS 5.12 08/04/2020
> > > RIP: 0010:alloc_tag_module_unload+0x22b/0x3f0
> > >  codetag_unload_module+0x19b/0x2a0
> > >  ? codetag_load_module+0x80/0x80
> > >  ? up_write+0x4f0/0x4f0
> >
> > "Well, yes, but actually no."
> >
> > At this time, kfree_rcu() has been called on all 4 objects.
> >
> > Looks like kfree_rcu no longer cares even about rcu_barrier(), and
> > there is no kvfree_rcu_barrier() in 6.11.
> >
> > The warning goes away when I replace kfree_rcu with call_rcu+kfree
> > plus rcu_barrier in module exit path.
> >
> > But I don't think it's the right thing to do.
> >
> > (referring to nf_nat_unregister_fn(), kfree_rcu(priv, rcu_head);).
> >
> > Reproducer:
> > unshare -n iptables-nft -t nat -A PREROUTING -p tcp
> > grep nf_nat /proc/allocinfo # will list 4 allocations
> > rmmod nft_chain_nat
> > rmmod nf_nat                # will WARN.
> >
> > Without rmmod, the 4 allocations go away after a few seconds,
> > grep will no longer list them and then rmmod won't splat.
> 
> I see. So, the kfree_rcu() was already called but freeing did not
> happen yet, in the meantime we are unloading the module.

Yes.

> We could add
> a synchronize_rcu() at the beginning of codetag_unload_module() so
> that all pending kfree_rcu()s complete before we check codetag
> counters:
> 
> bool codetag_unload_module(struct module *mod)
> {
>         struct codetag_type *cttype;
>         bool unload_ok = true;
> 
>         if (!mod)
>                 return true;
> 
> +      synchronize_rcu();
>         mutex_lock(&codetag_lock);

This doesn't help: synchronize_rcu() does not wait for pending
kfree_rcu() frees to complete.

Use of kvfree_rcu_barrier() instead does work though.
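
To illustrate why the two calls behave differently, here is a toy
userspace model. It is not kernel code: every toy_* name is made up for
this sketch, and it assumes a simplified picture in which kfree_rcu()
only queues an object for a later deferred free, so the per-callsite
counter stays non-zero until that queue is drained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only; none of the toy_* functions are kernel APIs. */
#define TOY_MAX_PENDING 16

static void *toy_pending[TOY_MAX_PENDING]; /* objects queued for deferred free */
static size_t toy_npending;
static size_t toy_live; /* stands in for the per-callsite allocation counter */

/* Allocate one object and account for it. */
static void *toy_alloc(void)
{
	void *p = malloc(64);

	assert(p != NULL);
	toy_live++;
	return p;
}

/* Models kfree_rcu(): the object is only queued, the counter is unchanged. */
static void toy_kfree_rcu(void *p)
{
	assert(toy_npending < TOY_MAX_PENDING);
	toy_pending[toy_npending++] = p;
}

/* Models synchronize_rcu(): a grace period passes, the queue is not drained. */
static void toy_synchronize_rcu(void)
{
}

/* Models kvfree_rcu_barrier(): every queued object is freed before returning. */
static void toy_kvfree_rcu_barrier(void)
{
	while (toy_npending > 0) {
		free(toy_pending[--toy_npending]);
		toy_live--;
	}
}

/* Models the unload-time check: non-zero if nothing is still accounted. */
static int toy_unload_ok(void)
{
	return toy_live == 0;
}
```

In this model, queueing four objects with toy_kfree_rcu() and then
calling toy_synchronize_rcu() still leaves toy_unload_ok() false, while
calling toy_kvfree_rcu_barrier() first makes it true.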
