Message-ID: <7f0076f0-6dc5-44e1-8036-c066616cc59b@suse.cz>
Date: Tue, 8 Oct 2024 10:14:10 +0200
From: Vlastimil Babka <vbabka@...e.cz>
To: Suren Baghdasaryan <surenb@...gle.com>,
 Andrew Morton <akpm@...ux-foundation.org>
Cc: Florian Westphal <fw@...len.de>, linux-kernel@...r.kernel.org,
 Uladzislau Rezki <urezki@...il.com>,
 Kent Overstreet <kent.overstreet@...ux.dev>,
 Ben Greear <greearb@...delatech.com>
Subject: Re: [PATCH lib] lib: alloc_tag_module_unload must wait for pending
 kfree_rcu calls

On 10/8/24 03:49, Suren Baghdasaryan wrote:
> On Mon, Oct 7, 2024 at 6:15 PM Andrew Morton <akpm@...ux-foundation.org> wrote:
>>
>> On Mon,  7 Oct 2024 22:52:24 +0200 Florian Westphal <fw@...len.de> wrote:
>>
>> > Ben Greear reports the following splat:
>> >  ------------[ cut here ]------------
>> >  net/netfilter/nf_nat_core.c:1114 module nf_nat func:nf_nat_register_fn has 256 allocated at module unload
>> >  WARNING: CPU: 1 PID: 10421 at lib/alloc_tag.c:168 alloc_tag_module_unload+0x22b/0x3f0
>> >  Modules linked in: nf_nat(-) btrfs ufs qnx4 hfsplus hfs minix vfat msdos fat
>> > ...
>> >  Hardware name: Default string Default string/SKYBAY, BIOS 5.12 08/04/2020
>> >  RIP: 0010:alloc_tag_module_unload+0x22b/0x3f0
>> >   codetag_unload_module+0x19b/0x2a0
>> >   ? codetag_load_module+0x80/0x80
>> >
>> > nf_nat module exit calls kfree_rcu on those addresses, but the free
>> > operation is likely still pending by the time alloc_tag checks for leaks.
>> >
>> > Waiting for outstanding kfree_rcu operations to complete before checking
>> > resolves this warning.
>> >
>> > Reproducer:
>> > unshare -n iptables-nft -t nat -A PREROUTING -p tcp
>> > grep nf_nat /proc/allocinfo # will list 4 allocations
>> > rmmod nft_chain_nat
>> > rmmod nf_nat                # will WARN.
>> >
>> > ...
>> >
>> > --- a/lib/codetag.c
>> > +++ b/lib/codetag.c
>> > @@ -228,6 +228,8 @@ bool codetag_unload_module(struct module *mod)
>> >       if (!mod)
>> >               return true;
>> >
>> > +     kvfree_rcu_barrier();
>> > +
>> >       mutex_lock(&codetag_lock);
>> >       list_for_each_entry(cttype, &codetag_types, link) {
>> >               struct codetag_module *found = NULL;
>>
>> It's always hard to determine why a thing like this is present, so a
>> comment is helpful:
>>
>> --- a/lib/codetag.c~lib-alloc_tag_module_unload-must-wait-for-pending-kfree_rcu-calls-fix
>> +++ a/lib/codetag.c
>> @@ -228,6 +228,7 @@ bool codetag_unload_module(struct module
>>         if (!mod)
>>                 return true;
>>
>> +       /* await any module's kfree_rcu() operations to complete */
>>         kvfree_rcu_barrier();
>>
>>         mutex_lock(&codetag_lock);
>> _
>>
>> But I do wonder whether this is in the correct place.
>>
>> Waiting for a module's ->exit() function's kfree_rcu()s to complete
>> should properly be done by the core module handling code?
> 
> I don't think core module code cares about kfree_rcu()s being complete
> before the module is unloaded.

Right, module unload should care about pending call_rcu() callbacks that
involve module code and/or module-created kmem_caches that are about to be
destroyed, but I think it's up to individual modules anyway to do an
rcu_barrier() in those cases?

> Allocation tagging OTOH cares because it is about to destroy tags
> which will be accessed when kfree() actually happens, therefore a
> strict ordering is important.
> 
>>
>> free_module() does a full-on synchronize_rcu() prior to freeing the
>> module memory itself and I think codetag_unload_module() could be
>> called after that?
> I think we could move codetag_unload_module() after synchronize_rcu()
> inside free_module() but according to the reply in
> https://lore.kernel.org/all/20241007112904.GA27104@breakpoint.cc/
> synchronize_rcu() does not help. I'm not quite sure why...

synchronize_rcu() only waits for pre-existing RCU read-side critical sections
to finish? You might have expected rcu_barrier() to help, as that waits for
pending call_rcu() callbacks. But as Ulad said, the kfree_rcu() implementation
does extra batching, so there's now a new barrier for that,
kvfree_rcu_barrier(). It was originally intended for kmem_cache_destroy(),
but is indeed useful here as well.
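
To illustrate the ordering problem, here is a toy userspace model in C. It is
not kernel code: every model_* name is made up for illustration, and it
ignores RCU grace periods entirely. It only sketches why a leak check that
runs while frees are still queued sees the objects as allocated, and why
draining the queue first makes the check pass:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
#define BATCH_MAX 8

static int live_objects;          /* what the leak check looks at */
static void *pending[BATCH_MAX];  /* frees queued but not yet performed */
static int npending;

static void *model_alloc(size_t size)
{
	void *p = malloc(size);

	if (p)
		live_objects++;
	return p;
}

/* Stands in for kfree_rcu(): the object is only queued, not freed yet. */
static void model_kfree_rcu(void *p)
{
	if (!p)
		return;
	assert(npending < BATCH_MAX);
	pending[npending++] = p;
}

/* Stands in for kvfree_rcu_barrier(): drain every queued free. */
static void model_kvfree_rcu_barrier(void)
{
	for (int i = 0; i < npending; i++) {
		free(pending[i]);
		live_objects--;
	}
	npending = 0;
}

/* Stands in for the leak check at module unload: 1 if nothing is left. */
static int model_unload_check(void)
{
	return live_objects == 0;
}

/* Returns 1 if the check passes for a module whose exit queued 4 frees. */
static int model_module_unload(int use_barrier)
{
	int clean;

	for (int i = 0; i < 4; i++)
		model_kfree_rcu(model_alloc(64));  /* module exit path */

	if (use_barrier)
		model_kvfree_rcu_barrier();
	clean = model_unload_check();

	model_kvfree_rcu_barrier();  /* the queued frees do run eventually */
	return clean;
}
```

In this model, model_module_unload(0) reports leftover objects (the case that
would WARN), while model_module_unload(1) reports none, because the barrier
runs before the check.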

> Note that once I'm done upstreaming
> https://lore.kernel.org/all/20240902044128.664075-3-surenb@google.com/,
> this change will not be needed and I'm planning to remove this call,
> however this change is useful for backporting. It should be sent to
> stable@...r.kernel.org # v6.10+

