Message-ID: <217c52ba-3b06-488d-b21a-3d2dd62438a8@linux.dev>
Date:   Tue, 24 Oct 2023 10:39:37 +0800
From:   Chengming Zhou <chengming.zhou@...ux.dev>
To:     Vlastimil Babka <vbabka@...e.cz>, cl@...ux.com, penberg@...nel.org
Cc:     rientjes@...gle.com, iamjoonsoo.kim@....com,
        akpm@...ux-foundation.org, roman.gushchin@...ux.dev,
        42.hyeyoo@...il.com, willy@...radead.org, pcc@...gle.com,
        tytso@....edu, maz@...nel.org, ruansy.fnst@...itsu.com,
        vishal.moola@...il.com, lrh2000@....edu.cn, hughd@...gle.com,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        Chengming Zhou <zhouchengming@...edance.com>
Subject: Re: [RFC PATCH v2 3/6] slub: Don't freeze slabs for cpu partial

On 2023/10/24 00:00, Vlastimil Babka wrote:
> On 10/21/23 16:43, chengming.zhou@...ux.dev wrote:
>> From: Chengming Zhou <zhouchengming@...edance.com>
>>
>> Currently we freeze slabs when moving them from the node partial list
>> to the cpu partial list; this approach needs two cmpxchg_double
>> operations:
>>
>> 1. freeze the slab (acquire_slab()) under the node list_lock
>> 2. get_freelist() when the slab is picked for use in ___slab_alloc()
>>
>> Actually we don't need to freeze when moving slabs out of the node
>> partial list; we can delay freezing until the slab's freelist is used
>> in ___slab_alloc(), saving one cmpxchg_double().
>>
>> There are other benefits as well:
>>
>> 1. Moving slabs between the node partial list and the cpu partial list
>>    becomes simpler, since we don't need to freeze or unfreeze at all.
>>
>> 2. Contention on the node list_lock would be lower, since we only need
>>    to freeze one slab under the node list_lock. (In fact, we can move
>>    slabs off the node partial list without freezing any slab at all,
>>    so contention on a slab won't turn into contention on the node
>>    list_lock.)
>>
>> We can do this because no concurrent path manipulates the partial slab
>> list except the __slab_free() path, which is now serialized.
>>
>> Note this patch only changes the parts that move partial slabs, to
>> keep the code review easy; the other parts will be fixed in the
>> following patches. Specifically, this patch changes three paths:
>> 1. getting a partial slab from a node: get_partial_node()
>> 2. putting a partial slab back to a node: __unfreeze_partials()
>> 3. caching a partial slab on the cpu in __slab_free()
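
To make the saving concrete, here is a toy user-space model of the two
schemes. It is only a sketch: the single-word encoding and the names
(freeze_eager(), take_freelist(), freeze_and_take()) are illustrative
and merely stand in for SLUB's real cmpxchg_double on the
freelist/counters pair.

/* Toy model: a slab's state packed into one 64-bit word
 * (low 32 bits = freelist head, bit 32 = frozen bit). */
#include <stdatomic.h>
#include <stdint.h>

#define FROZEN	(1ULL << 32)

struct toy_slab { _Atomic uint64_t word; };

/* Current scheme, step 1: acquire_slab() under the node list_lock
 * freezes the slab but leaves its freelist in place. */
static void freeze_eager(struct toy_slab *s)
{
	uint64_t old = atomic_load(&s->word);

	while (!atomic_compare_exchange_weak(&s->word, &old, old | FROZEN))
		;				/* 1st cmpxchg */
}

/* Current scheme, step 2: get_freelist() in ___slab_alloc() then
 * takes the freelist from the already-frozen slab. */
static uint32_t take_freelist(struct toy_slab *s)
{
	uint64_t old = atomic_load(&s->word);

	while (!atomic_compare_exchange_weak(&s->word, &old, old & FROZEN))
		;				/* 2nd cmpxchg */
	return (uint32_t)old;			/* freelist head */
}

/* Proposed scheme: the slab stays unfrozen on the cpu partial list;
 * ___slab_alloc() freezes it and grabs the freelist in one step. */
static uint32_t freeze_and_take(struct toy_slab *s)
{
	uint64_t old = atomic_load(&s->word);

	while (!atomic_compare_exchange_weak(&s->word, &old, FROZEN))
		;				/* the only cmpxchg */
	return (uint32_t)old;			/* freelist head */
}
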
> 
> So the problem with this approach, where one patch breaks things and
> another one fixes them up, is that git bisect becomes problematic. We
> shouldn't do that; instead we should bite the bullet and deal with a
> potentially large patch. At some level that's unavoidable, as one has to
> grasp all the moving pieces anyway to see e.g. whether the changes in the
> allocation path are compatible with the changes in freeing.
> When possible, we can do preparatory stuff that doesn't break things, like
> in patches 1 and 2; maybe get_cpu_partial() could also be introduced (even
> if unused) before switching the world over to the new scheme in a single
> patch (with possible cleanups afterwards). So would it be possible to redo
> it in such a way, please?

Ok, I will change it to work that way. I didn't know how to handle this
potentially large patch gracefully at first, so your detailed advice is
very helpful: I will prepare all the possible preparatory stuff, then
switch all the parts over in one patch.
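
Roughly, I imagine the restructured series would look something like
this (an illustrative outline only, not final patch titles):

1. preparatory changes that don't break anything (like the current
   patches 1 and 2)
2. introduce get_cpu_partial(), still unused
3. one patch that switches the allocation and free paths over to
   unfrozen cpu partial slabs
4. possible cleanups afterwards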

> 
> <snip>
> 
>> @@ -2621,23 +2622,7 @@ static void __unfreeze_partials(struct kmem_cache *s, struct slab *partial_slab)
>>  			spin_lock_irqsave(&n->list_lock, flags);
>>  		}
>>  
>> -		do {
>> -
>> -			old.freelist = slab->freelist;
>> -			old.counters = slab->counters;
>> -			VM_BUG_ON(!old.frozen);
>> -
>> -			new.counters = old.counters;
>> -			new.freelist = old.freelist;
>> -
>> -			new.frozen = 0;
>> -
>> -		} while (!__slab_update_freelist(s, slab,
>> -				old.freelist, old.counters,
>> -				new.freelist, new.counters,
>> -				"unfreezing slab"));
>> -
>> -		if (unlikely(!new.inuse && n->nr_partial >= s->min_partial)) {
>> +		if (unlikely(!slab->inuse && n->nr_partial >= s->min_partial)) {
>>  			slab->next = slab_to_discard;
>>  			slab_to_discard = slab;
>>  		} else {
>> @@ -3634,18 +3619,8 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
>>  		was_frozen = new.frozen;
>>  		new.inuse -= cnt;
>>  		if ((!new.inuse || !prior) && !was_frozen) {
>> -
>> -			if (kmem_cache_has_cpu_partial(s) && !prior) {
>> -
>> -				/*
>> -				 * Slab was on no list before and will be
>> -				 * partially empty
>> -				 * We can defer the list move and instead
>> -				 * freeze it.
>> -				 */
>> -				new.frozen = 1;
>> -
>> -			} else { /* Needs to be taken off a list */
>> +			/* Needs to be taken off a list */
>> +			if (!kmem_cache_has_cpu_partial(s) || prior) {
>>  
>>  				n = get_node(s, slab_nid(slab));
>>  				/*
>> @@ -3675,7 +3650,7 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
>>  			 * activity can be necessary.
>>  			 */
>>  			stat(s, FREE_FROZEN);
>> -		} else if (new.frozen) {
>> +		} else if (kmem_cache_has_cpu_partial(s) && !prior) {
>>  			/*
>>  			 * If we just froze the slab then put it onto the
>>  			 * per cpu partial list.
> 
> I think this comment is now misleading; we didn't freeze the slab, so it
> should now say something like "if we started with a full slab...".

Ok, I will check and fix these inconsistent comments while at it.
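
For example, the comment could read something like this (wording is
just a sketch, to be finalized in the next version):

	} else if (kmem_cache_has_cpu_partial(s) && !prior) {
		/*
		 * The slab was full before this free; defer the
		 * freeze and just put it onto the per cpu partial
		 * list.
		 */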

Thanks!
