Message-ID: <1eeb84d4-42b1-d204-ece1-b76bfbc548bf@linux.com>
Date: Tue, 27 Feb 2024 14:55:15 -0800 (PST)
From: "Christoph Lameter (Ampere)" <cl@...ux.com>
To: Chengming Zhou <chengming.zhou@...ux.dev>
cc: Vlastimil Babka <vbabka@...e.cz>, David Rientjes <rientjes@...gle.com>, 
    Jianfeng Wang <jianfeng.w.wang@...cle.com>, penberg@...nel.org, 
    iamjoonsoo.kim@....com, akpm@...ux-foundation.org, 
    roman.gushchin@...ux.dev, 42.hyeyoo@...il.com, linux-mm@...ck.org, 
    linux-kernel@...r.kernel.org, Chengming Zhou <zhouchengming@...edance.com>
Subject: Re: [PATCH] slub: avoid scanning all partial slabs in
 get_slabinfo()

On Tue, 27 Feb 2024, Chengming Zhou wrote:

>> We could mark the state change (list ownership) in the slab metadata and then abort the scan if the state mismatches the list.
>
> It seems feasible, maybe something like below?
>
> But this way needs all kmem_caches to have SLAB_TYPESAFE_BY_RCU, right?

No.

If a slab is freed to the page allocator and the fields are reused in a 
different way, then we would have to wait until the end of the RCU period. 
This could be done with a deferred free. Otherwise we rely on the type 
checking to ensure that nothing untoward happens during the RCU period.
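
For illustration, a minimal sketch of that deferred free, assuming the 
slab metadata has room for an rcu_head (all names here are hypothetical, 
not the actual struct slab layout):

```
/* Hypothetical sketch: defer returning the backing page to the page
 * allocator until a grace period has elapsed, so a concurrent lockless
 * scanner can never see the fields reused for something else.
 */
struct slab_stub {
	struct rcu_head rcu;		/* assumed room for the callback */
	/* ... list pointer, type tag, counters ... */
};

static void slab_stub_free_rcu(struct rcu_head *head)
{
	struct slab_stub *s = container_of(head, struct slab_stub, rcu);

	/* No RCU reader can still be traversing this slab now. */
	free_backing_page(s);		/* hypothetical helper */
}

static void slab_stub_release(struct slab_stub *s)
{
	call_rcu(&s->rcu, slab_stub_free_rcu);
}
```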

The usual shuffling of pages between freelists/cpu lists/cpu slabs and 
fully used slabs would not require that.

> Not sure if this is acceptable? It may cause random delays in freeing memory.
>
> ```
> retry:
> 	rcu_read_lock();
>
> 	h = rcu_dereference(list_next_rcu(&n->partial));
>
> 	while (h != &n->partial) {

Hmm... a linked list that forms a circle? Linked lists usually terminate 
in a NULL pointer.

So this would be:

redo:
	<zap counters>
	rcu_read_lock();
	h = <first>;

	while (h && h->type == <our_type>) {
		<count h somethings>

		/* Maybe check h->type again */
		if (h->type != <our_type>)
			break;

		h = <next>;
	}

	rcu_read_unlock();

	if (h)	/* Type of list changed under us */
		goto redo;


The check for type == <our_type> is racy. Maybe we can ignore that, or 
we could do something additional.
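
Concretely, the sketch could be rendered like this (struct and field 
names are made up for illustration; this is not the real struct slab):

```
/* Hypothetical node layout for a NULL-terminated, RCU-walked chain. */
struct snode {
	struct snode *next;
	int type;			/* which list currently owns the node */
	unsigned long inuse;		/* the counter we want to sum up */
};

static unsigned long count_lockless(struct snode **head, int our_type)
{
	struct snode *h;
	unsigned long total;

redo:
	total = 0;				/* <zap counters> */
	rcu_read_lock();

	for (h = rcu_dereference(*head); h; h = rcu_dereference(h->next)) {
		if (READ_ONCE(h->type) != our_type)
			break;			/* ownership changed under us */
		total += READ_ONCE(h->inuse);	/* <count h somethings> */
	}

	rcu_read_unlock();

	if (h)					/* terminated early: retry */
		goto redo;

	return total;
}
```

A pathological workload could keep forcing retries, but the scan itself 
stays lockless.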

Using RCU does not make sense if you add locking in the inner loop; that 
gets too complicated and introduces delays. This must be a simple, fast, 
lockless loop in order to do what we need.

Presumably the type and the list pointers are in the same cacheline and 
thus could be made to update in a coherent way if properly sequenced with 
fences etc.
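
As one sketch of that sequencing (same hypothetical fields as above, and 
only one plausible ordering given the race noted earlier): the writer 
publishes the new type with a release store, and the reader's acquire 
load of the type then orders its subsequent reads of the list pointers:

```
/* Writer: finish rewiring h->next for the new list first, then make
 * the new type visible; smp_store_release() orders the prior writes
 * before the store to h->type.
 */
static void retag_node(struct snode *h, int new_type)
{
	/* ... unlink from the old list, set up h->next ... */
	smp_store_release(&h->type, new_type);
}

/* Reader: pairs with the release store above; once the type matches,
 * the list pointers read afterwards are at least as new as the type.
 */
static bool node_is_ours(struct snode *h, int our_type)
{
	return smp_load_acquire(&h->type) == our_type;
}
```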
