Message-ID: <0c8cc83bb73abf080faf584f319008b67d0931db.camel@linaro.org>
Date: Wed, 30 Jul 2025 13:20:18 +0100
From: André Draszik <andre.draszik@...aro.org>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>, 
	linux-kernel@...r.kernel.org
Cc: André Almeida <andrealmeid@...lia.com>, Darren Hart	
 <dvhart@...radead.org>, Davidlohr Bueso <dave@...olabs.net>, Ingo Molnar	
 <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>, Peter Zijlstra	
 <peterz@...radead.org>, Thomas Gleixner <tglx@...utronix.de>, Valentin
 Schneider <vschneid@...hat.com>, Waiman Long <longman@...hat.com>, Andrew
 Morton	 <akpm@...ux-foundation.org>, David Hildenbrand <david@...hat.com>,
 "Liam R. Howlett" <Liam.Howlett@...cle.com>, Lorenzo Stoakes
 <lorenzo.stoakes@...cle.com>, Michal Hocko	 <mhocko@...e.com>, Mike
 Rapoport <rppt@...nel.org>, Suren Baghdasaryan	 <surenb@...gle.com>,
 Vlastimil Babka <vbabka@...e.cz>, linux-mm@...ck.org
Subject: Re: [PATCH v2 2/6] futex: Use RCU-based per-CPU reference counting
 instead of rcuref_t

On Thu, 2025-07-10 at 13:00 +0200, Sebastian Andrzej Siewior wrote:
> From: Peter Zijlstra <peterz@...radead.org>
> 
> The use of rcuref_t for reference counting introduces a performance bottleneck
> when accessed concurrently by multiple threads during futex operations.
> 
> Replace rcuref_t with specially crafted per-CPU reference counters. The
> lifetime logic remains the same.
> 
> The newly allocated private hash starts in FR_PERCPU state. In this state, each
> futex operation that requires the private hash uses a per-CPU counter (an
> unsigned int) for incrementing or decrementing the reference count.
> 
> When the private hash is about to be replaced, the per-CPU counters are
> migrated to an atomic_t counter, mm_struct::futex_atomic.
> The migration process:
> - Waiting for one RCU grace period to ensure all users observe the
>   current private hash. This can be skipped if a grace period elapsed
>   since the private hash was assigned.
> 
> - futex_private_hash::state is set to FR_ATOMIC, forcing all users to
>   use mm_struct::futex_atomic for reference counting.
> 
> - After an RCU grace period, all users are guaranteed to be using the
>   atomic counter. The per-CPU counters can now be summed up and added to
>   the atomic_t counter. If the resulting count is zero, the hash can be
>   safely replaced. Otherwise, active users still hold a valid reference.
> 
> - Once the atomic reference count drops to zero, the next futex
>   operation will switch to the new private hash.
> 
> call_rcu_hurry() is used to speed up the transition, which might otherwise
> be delayed with RCU_LAZY. There is nothing wrong with using call_rcu(). The
> side effects would be that on auto scaling the new hash is used later
> and the SET_SLOTS prctl() will block longer.
> 
> [bigeasy: commit description + mm get/ put_async]
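
For anyone following along, the migration steps quoted above can be
sketched in userspace C roughly as follows. This is only a single-threaded
illustration, not the kernel code: the names sketch_ref, sketch_get(),
sketch_put(), sketch_migrate() and NR_SKETCH_CPUS are made up for this
example, a plain array stands in for the per-CPU counter, and the RCU
grace periods are reduced to comments.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_SKETCH_CPUS 4	/* hypothetical CPU count, for illustration only */

enum sketch_state { FR_PERCPU, FR_ATOMIC };

struct sketch_ref {
	enum sketch_state state;
	unsigned int percpu[NR_SKETCH_CPUS];	/* stands in for the per-CPU unsigned int */
	atomic_long atomic;			/* stands in for mm_struct::futex_atomic */
};

static void sketch_init(struct sketch_ref *r)
{
	r->state = FR_PERCPU;
	for (int i = 0; i < NR_SKETCH_CPUS; i++)
		r->percpu[i] = 0;
	atomic_init(&r->atomic, 0);
}

/* Take a reference on behalf of the given (simulated) CPU. */
static void sketch_get(struct sketch_ref *r, int cpu)
{
	assert(cpu >= 0 && cpu < NR_SKETCH_CPUS);
	if (r->state == FR_PERCPU)
		r->percpu[cpu]++;
	else
		atomic_fetch_add(&r->atomic, 1);
}

/*
 * Drop a reference. Returns true only if this was the last reference
 * while in FR_ATOMIC state. In FR_PERCPU state a single counter may wrap
 * below zero; that is fine because only the sum over all CPUs matters.
 */
static bool sketch_put(struct sketch_ref *r, int cpu)
{
	assert(cpu >= 0 && cpu < NR_SKETCH_CPUS);
	if (r->state == FR_PERCPU) {
		r->percpu[cpu]--;
		return false;
	}
	return atomic_fetch_sub(&r->atomic, 1) == 1;
}

/*
 * Switch to FR_ATOMIC and fold the per-CPU counts into the atomic counter.
 * Returns true if no references remain, i.e. the hash could be replaced.
 */
static bool sketch_migrate(struct sketch_ref *r)
{
	unsigned int sum = 0;

	/* The real code first waits for an RCU grace period if needed. */
	r->state = FR_ATOMIC;
	/* ...and for another one here, so nobody still uses percpu[]. */

	for (int i = 0; i < NR_SKETCH_CPUS; i++) {
		sum += r->percpu[i];	/* modulo arithmetic cancels the wraps */
		r->percpu[i] = 0;
	}
	return atomic_fetch_add(&r->atomic, (long)sum) + (long)sum == 0;
}
```

The point of the sketch is that a get on one CPU and a put on another
leave individual counters unbalanced, so the count is only meaningful once
all per-CPU values have been summed into the single atomic counter.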

kmemleak complains about a new memleak with this commit:

[  680.179004][  T101] kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

$ cat /sys/kernel/debug/kmemleak
unreferenced object (percpu) 0xc22ec0eface8 (size 4):
  comm "swapper/0", pid 1, jiffies 4294893115
  hex dump (first 4 bytes on cpu 7):
    01 00 00 00                                      ....
  backtrace (crc b8bc6765):
    kmemleak_alloc_percpu+0x48/0xb8
    pcpu_alloc_noprof+0x6ac/0xb68
    futex_mm_init+0x60/0xe0
    mm_init+0x1e8/0x3c0
    mm_alloc+0x5c/0x78
    init_args+0x74/0x4b0
    debug_vm_pgtable+0x60/0x2d8
    do_one_initcall+0x128/0x3e0
    do_initcall_level+0xb4/0xe8
    do_initcalls+0x60/0xb0
    do_basic_setup+0x28/0x40
    kernel_init_freeable+0x158/0x1f8
    kernel_init+0x2c/0x1e0
    ret_from_fork+0x10/0x20

And futex_mm_init+0x60/0xe0 resolves to
    mm->futex_ref = alloc_percpu(unsigned int);
in futex_mm_init().

Reverting this commit (and patches 3 and 4 in this series, due to context)
makes kmemleak happy again.

Cheers,
Andre'
