Message-ID: <aXsoxRXloFrvmOEL@hyeyoo>
Date: Thu, 29 Jan 2026 18:30:45 +0900
From: Harry Yoo <harry.yoo@...cle.com>
To: Hao Li <hao.li@...ux.dev>
Cc: Vlastimil Babka <vbabka@...e.cz>, Mateusz Guzik <mjguzik@...il.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Christoph Lameter <cl@...two.org>,
        David Rientjes <rientjes@...gle.com>,
        Roman Gushchin <roman.gushchin@...ux.dev>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org,
        kernel test robot <oliver.sang@...el.com>
Subject: Re: [PATCH] slub: avoid list_lock contention from
 __refill_objects_any()

On Thu, Jan 29, 2026 at 05:21:21PM +0800, Hao Li wrote:
> On Thu, Jan 29, 2026 at 10:07:57AM +0100, Vlastimil Babka wrote:
> > Kernel test robot has reported a regression in the patch "slab: refill
> > sheaves from all nodes". When taken in isolation like this, there is
> > indeed a tradeoff - we prefer to use remote objects prior to allocating
> > new local slabs. It is replicating a behavior that existed before
> > sheaves for replenishing cpu (partial) slabs - now called
> > get_from_any_partial() to allocate a single object.
> > 
> > So the possibility of allocating remote objects is intended even if
> > remote accesses are then slower. But the profiles in the report also
> > suggested a contention on the list_lock spinlock. And that's something
> > we can try to avoid without much tradeoff - if someone else has the
> > spin_lock, it's more likely they are allocating from the node than
> > freeing to it, so we can skip it even if it means allocating a new local
> > slab - contributing to that lock's contention isn't worth it. It should
> > not result in partial slabs accumulating on the remote node.
> > 
> > Thus add an allow_spin parameter to __refill_objects_node() and
> > get_partial_node_bulk() to make the attempts from __refill_objects_any()
> > use only a trylock.
> > 
> > Reported-by: kernel test robot <oliver.sang@...el.com>
> > Link: https://lore.kernel.org/oe-lkp/202601132136.77efd6d7-lkp@intel.com
> > Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
> 
> In my testing, this patch improved performance by:
> 
> will-it-scale.64.processes +14.2%
> will-it-scale.128.processes +9.6%
> will-it-scale.192.processes +10.8%
> will-it-scale.per_process_ops +11.6%
>
> Tested-by: Hao Li <hao.li@...ux.dev>

I wonder whether checking spin_is_contended() or spin_is_locked() first
would be better than a bare trylock, since it could avoid an atomic
operation when the lock is already held?
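
To illustrate the difference, here is a minimal user-space sketch using C11
atomics. It is not the kernel's spinlock_t: struct toy_lock and the toy_*()
helpers are hypothetical names made up for this example. The assumption is
that a trylock always issues an atomic read-modify-write, while a
check-first variant does a plain load and bails out early if the lock looks
held. Whether that assumption holds for the real spin_trylock() depends on
the architecture's lock implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy lock for illustration only; not the kernel's spinlock_t. */
struct toy_lock {
	atomic_bool locked;
};

/* Plain trylock: always performs an atomic read-modify-write (exchange). */
static bool toy_trylock(struct toy_lock *l)
{
	/* exchange returns the previous value; false means we took the lock */
	return !atomic_exchange_explicit(&l->locked, true,
					 memory_order_acquire);
}

/*
 * Check-first variant: a relaxed load plays the role of a
 * spin_is_locked()-style test. The atomic RMW is attempted only
 * when the lock looked free.
 */
static bool toy_trylock_if_free(struct toy_lock *l)
{
	if (atomic_load_explicit(&l->locked, memory_order_relaxed))
		return false;	/* held by someone else: skip this node */

	/* The lock may have been taken since the load, so this can still fail. */
	return toy_trylock(l);
}

static void toy_unlock(struct toy_lock *l)
{
	atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

In both variants the caller treats a false return as "skip the remote node
and allocate a new local slab instead". The only difference is whether a
failed attempt needs exclusive access to the lock's cache line.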

-- 
Cheers,
Harry / Hyeonggon
