Date: Fri, 23 Feb 2024 17:46:22 +0800
From: Chengming Zhou <chengming.zhou@...ux.dev>
To: Vlastimil Babka <vbabka@...e.cz>,
 "Christoph Lameter (Ampere)" <cl@...ux.com>
Cc: David Rientjes <rientjes@...gle.com>,
 Jianfeng Wang <jianfeng.w.wang@...cle.com>, penberg@...nel.org,
 iamjoonsoo.kim@....com, akpm@...ux-foundation.org, roman.gushchin@...ux.dev,
 42.hyeyoo@...il.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 Chengming Zhou <zhouchengming@...edance.com>
Subject: Re: [PATCH] slub: avoid scanning all partial slabs in get_slabinfo()

On 2024/2/23 17:37, Chengming Zhou wrote:
> On 2024/2/23 17:24, Vlastimil Babka wrote:
>> On 2/23/24 06:00, Chengming Zhou wrote:
>>> On 2024/2/23 11:50, Christoph Lameter (Ampere) wrote:
>>>> On Fri, 23 Feb 2024, Chengming Zhou wrote:
>>>>
>>>>>> Can we guesstimate the free objects based on the number of partial slabs? That number is available.
>>>>>
>>>>> Yeah, the number of partial slabs is easy to know, but I can't think of a way to
>>>>> estimate the free objects, since __slab_free() is just a double cmpxchg in most cases.
>>>>
>>>> Well a starting point may be half the objects possible in a slab page?
>>>
>>> Yeah, also a choice.
>>>
>>>>
>>>>
>>>>>> How accurate does the accounting need to be? We also have fuzzy accounting in the VM counters.
>>>>>
>>>>> It may not need to be very accurate; some delay/fuzziness should be acceptable.
>>>>>
>>>>> Another direction, I think, is to no longer distinguish between slabs on the cpu partial
>>>>> list and slabs on the node partial list (unlike the current behavior).
>>>>>
>>>>> Now we have three scopes:
>>>>> 1. SL_ALL: include all slabs
>>>>> 2. SL_PARTIAL: only include partial slabs on node
>>>>> 3. SL_CPU: only include partial slabs on cpu and the in-use cpu slab
>>>>>
>>>>> If we change SL_PARTIAL to mean all partial slabs, it may be simpler.
>>>>
>>>> That's not going to work since you would have to scan multiple lists instead of a single list.
>>>
>>> We have to use percpu counters if we go this way.
>>>
>>>>
>>>> Another approach may be to come up with some way to scan the partial lists without taking locks. That actually would improve the performance of the allocator. It may work with a singly linked list and RCU.
>>
>> We often remove a slab from the middle of a partial list due to object
>> freeing, and this means it has to be doubly linked, no?
> 
> Right, doubly linked list.
> 
>>
>>>>
>>>
>>> I think this is a better direction! We can use an RCU list if slabs can be freed by RCU.
>>
>> Often we remove a slab from the partial list for purposes other than freeing -
>> i.e. to become a cpu (partial) slab, and that can't be handled by an RCU
>> callback, nor can we wait a grace period in such situations.
> 
> IMHO, only free_slab() needs to use call_rcu() to delay freeing the slab;
> other paths, like taking partial slabs from the node partial list, don't need
> to wait for an RCU grace period.
> 
> All we want is to iterate over the node partial list safely and locklessly, right?
Ah, I'm wrong, these paths also need to wait for an RCU grace period...
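
For reference, the "half the objects possible in a slab" starting point
suggested earlier in the thread could be sketched roughly as below. This is
only a user-space illustration, not kernel code: struct node_stats and
estimate_free_objects() are hypothetical names made up for the example, and
the real get_slabinfo() works on different structures.

```c
/*
 * Hypothetical stand-in for the per-node data discussed in the thread.
 * These are illustrative fields, not the kernel's actual structures.
 */
struct node_stats {
	unsigned long nr_partial;    /* slabs on the node partial list */
	unsigned long objs_per_slab; /* objects that fit in one slab */
};

/*
 * Estimate free objects without walking the partial list: assume each
 * partial slab is, on average, half empty. The result is fuzzy by
 * design; it trades accuracy for not taking the list lock.
 */
static unsigned long estimate_free_objects(const struct node_stats *n)
{
	return (n->nr_partial * n->objs_per_slab) / 2;
}
```

The estimate depends only on the partial slab count, which is already
available, so it costs O(1) instead of a scan of every partial slab.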

