Message-ID: <CAJuCfpH_g2ousOyUe19hwUpTGsQZa=w8sK9TCvU-aUsNKDdJTw@mail.gmail.com>
Date: Wed, 9 Oct 2024 11:23:47 -0700
From: Suren Baghdasaryan <surenb@...gle.com>
To: Ben Greear <greearb@...delatech.com>
Cc: Florian Westphal <fw@...len.de>, netdev <netdev@...r.kernel.org>, kent.overstreet@...ux.dev, 
	pablo@...filter.org
Subject: Re: nf-nat-core: allocated memory at module unload.

On Wed, Oct 9, 2024 at 11:20 AM Ben Greear <greearb@...delatech.com> wrote:
>
> On 10/7/24 08:10, Suren Baghdasaryan wrote:
> > On Mon, Oct 7, 2024 at 4:29 AM Florian Westphal <fw@...len.de> wrote:
> >>
> >> Suren Baghdasaryan <surenb@...gle.com> wrote:
> >>> On Tue, Oct 1, 2024 at 12:36 PM Florian Westphal <fw@...len.de> wrote:
> >>>>
> >>>> Ben Greear <greearb@...delatech.com> wrote:
> >>>>
> >>>> [ CCing codetag folks ]
> >>>
> >>> Thanks! I've been on vacation and just saw this report.
> >>>
> >>>>
> >>>>> Hello,
> >>>>>
> >>>>> I see this splat in 6.11.0 (plus a single patch to fix vrf xmit deadlock).
> >>>>>
> >>>>> Is this a known issue?  Is it a serious problem?
> >>>>
> >>>> Not known to me.  Looks like an mm (rcu)+codetag problem.
> >>>>
> >>>>> ------------[ cut here ]------------
> >>>>> net/netfilter/nf_nat_core.c:1114 module nf_nat func:nf_nat_register_fn has 256 allocated at module unload
> >>>>> WARNING: CPU: 1 PID: 10421 at lib/alloc_tag.c:168 alloc_tag_module_unload+0x22b/0x3f0
> >>>>> Modules linked in: nf_nat(-) btrfs ufs qnx4 hfsplus hfs minix vfat msdos fat
> >>>> ...
> >>>>> Hardware name: Default string Default string/SKYBAY, BIOS 5.12 08/04/2020
> >>>>> RIP: 0010:alloc_tag_module_unload+0x22b/0x3f0
> >>>>>   codetag_unload_module+0x19b/0x2a0
> >>>>>   ? codetag_load_module+0x80/0x80
> >>>>>   ? up_write+0x4f0/0x4f0
> >>>>
> >>>> "Well, yes, but actually no."
> >>>>
> >>>> At this time, kfree_rcu() has been called on all 4 objects.
> >>>>
> >>>> Looks like kfree_rcu no longer cares even about rcu_barrier(), and
> >>>> there is no kvfree_rcu_barrier() in 6.11.
> >>>>
> >>>> The warning goes away when I replace kfree_rcu with call_rcu+kfree
> >>>> plus rcu_barrier in module exit path.
> >>>>
> >>>> But I don't think it's the right thing to do.
>
> Hello,
>
> Is this approach just ugly, or plain wrong?

I think the approach is correct.

>
> kvfree_rcu_barrier() does not exist in the 6.10 kernel.

Yeah, I'll try backporting kvfree_rcu_barrier() to 6.10 and 6.11 for
this change.

>
> Thanks,
> Ben
>
> >>>>
> >>>> (referring to nf_nat_unregister_fn(), kfree_rcu(priv, rcu_head);).
> >>>>
> >>>> Reproducer:
> >>>> unshare -n iptables-nft -t nat -A PREROUTING -p tcp
> >>>> grep nf_nat /proc/allocinfo # will list 4 allocations
> >>>> rmmod nft_chain_nat
> >>>> rmmod nf_nat                # will WARN.
> >>>>
> >>>> Without rmmod, the 4 allocations go away after a few seconds,
> >>>> grep will no longer list them and then rmmod won't splat.
> >>>
> >>> I see. So kfree_rcu() was already called but the freeing has not
> >>> happened yet, and in the meantime we are unloading the module.
> >>
> >> Yes.
> >>
> >>> We could add
> >>> a synchronize_rcu() at the beginning of codetag_unload_module() so
> >>> that all pending kfree_rcu()s complete before we check codetag
> >>> counters:
> >>>
> >>> bool codetag_unload_module(struct module *mod)
> >>> {
> >>>          struct codetag_type *cttype;
> >>>          bool unload_ok = true;
> >>>
> >>>          if (!mod)
> >>>                  return true;
> >>>
> >>> +      synchronize_rcu();
> >>>          mutex_lock(&codetag_lock);
> >>
> >> This doesn't help as kfree_rcu doesn't wait for this.
> >>
> >> Use of kvfree_rcu_barrier() instead does work though.
> >
> > I see. That sounds like an acceptable fix. Please post it and I'll ack it.
> > Thanks!
> >
> --
> Ben Greear <greearb@...delatech.com>
> Candela Technologies Inc  http://www.candelatech.com
>
>
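
The ordering problem discussed in this thread can be illustrated with a
toy userspace model. This is only a sketch, not kernel code: every name
in it (toy_tracker, toy_alloc, toy_free_deferred, toy_wait_grace_period,
toy_deferred_free_barrier, toy_unload_ok) is hypothetical, and the
behavior it models is simplified to what the thread states: a deferred
free only queues the object, waiting for a grace period does not drain
that queue, and only an explicit barrier does. The counter stands in for
what the allocation tag reports at module unload.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */

#define TOY_MAX_PENDING 16

struct toy_tracker {
	size_t live_bytes;                    /* what an allocation tag would report */
	void  *pending[TOY_MAX_PENDING];      /* objects queued for deferred free */
	size_t pending_size[TOY_MAX_PENDING]; /* size of each queued object */
	size_t npending;
};

/* Allocate and account the bytes against the tracker. */
static void *toy_alloc(struct toy_tracker *t, size_t size)
{
	void *p = malloc(size);

	if (p)
		t->live_bytes += size;
	return p;
}

/*
 * Stand-in for kfree_rcu(): the object is only queued. The memory stays
 * allocated, and stays counted, until the queue is drained.
 * Returns 0 on success, -1 if the toy queue is full.
 */
static int toy_free_deferred(struct toy_tracker *t, void *p, size_t size)
{
	if (t->npending == TOY_MAX_PENDING)
		return -1;
	t->pending[t->npending] = p;
	t->pending_size[t->npending] = size;
	t->npending++;
	return 0;
}

/*
 * Stand-in for synchronize_rcu() as described in the thread: it waits,
 * but it does not force the queued deferred frees to run.
 */
static void toy_wait_grace_period(struct toy_tracker *t)
{
	(void)t; /* intentionally leaves the pending queue untouched */
}

/* Stand-in for kvfree_rcu_barrier(): drain every queued deferred free. */
static void toy_deferred_free_barrier(struct toy_tracker *t)
{
	size_t i;

	for (i = 0; i < t->npending; i++) {
		free(t->pending[i]);
		t->live_bytes -= t->pending_size[i];
	}
	t->npending = 0;
}

/* Stand-in for the unload-time check: nonzero means nothing is still counted. */
static int toy_unload_ok(const struct toy_tracker *t)
{
	return t->live_bytes == 0;
}
```

In this model, queuing four 64-byte objects (a size picked only so the
total matches the 256 bytes in the report) and then calling
toy_wait_grace_period() still leaves toy_unload_ok() returning 0, which
corresponds to the warning. Calling toy_deferred_free_barrier() before
the check makes it return 1, which corresponds to the
kvfree_rcu_barrier() approach.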
