Message-ID: <ba9919b9-2231-45b1-b6e5-7239fbc167c1@suse.cz>
Date: Wed, 12 Mar 2025 19:16:51 +0100
From: Vlastimil Babka <vbabka@...e.cz>
To: Harry Yoo <harry.yoo@...cle.com>
Cc: Suren Baghdasaryan <surenb@...gle.com>,
 "Liam R. Howlett" <Liam.Howlett@...cle.com>, Christoph Lameter
 <cl@...ux.com>, David Rientjes <rientjes@...gle.com>,
 Roman Gushchin <roman.gushchin@...ux.dev>,
 Hyeonggon Yoo <42.hyeyoo@...il.com>, Uladzislau Rezki <urezki@...il.com>,
 linux-mm@...ck.org, linux-kernel@...r.kernel.org, rcu@...r.kernel.org,
 maple-tree@...ts.infradead.org
Subject: Re: [PATCH RFC v2 06/10] slab: sheaf prefilling for guaranteed
 allocations

On 2/25/25 09:00, Harry Yoo wrote:
> On Fri, Feb 14, 2025 at 05:27:42PM +0100, Vlastimil Babka wrote:
>> Add functions for efficient guaranteed allocations e.g. in a critical
>> section that cannot sleep, when the exact number of allocations is not
>> known beforehand, but an upper limit can be calculated.
>> 
>> kmem_cache_prefill_sheaf() returns a sheaf containing at least the given
>> number of objects.
>> 
>> kmem_cache_alloc_from_sheaf() will allocate an object from the sheaf
>> and is guaranteed not to fail until depleted.
>> 
>> kmem_cache_return_sheaf() is for giving the sheaf back to the slab
>> allocator after the critical section. This will also attempt to refill
>> it to the cache's sheaf capacity for better efficiency of sheaf handling,
>> but it's not strictly necessary for that refill to succeed.
>> 
>> kmem_cache_refill_sheaf() can be used to refill a previously obtained
>> sheaf to the requested size. If the current size is sufficient, it does
>> nothing. If the requested size exceeds the cache's sheaf_capacity and the
>> sheaf's current capacity, the sheaf will be replaced with a new one,
>> hence the indirect pointer parameter.
>> 
>> kmem_cache_sheaf_size() can be used to query the current size.
>> 
>> The implementation supports requesting sizes that exceed the cache's
>> sheaf_capacity, but it is not efficient - such sheaves are allocated
>> fresh in kmem_cache_prefill_sheaf() and flushed and freed immediately by
>> kmem_cache_return_sheaf(). kmem_cache_refill_sheaf() might be especially
>> inefficient when replacing a sheaf with a new one of a larger capacity.
>> It is therefore better to size the cache's sheaf_capacity accordingly.
>> 
>> Signed-off-by: Vlastimil Babka <vbabka@...e.cz>
>> ---
>>  include/linux/slab.h |  16 ++++
>>  mm/slub.c            | 227 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 243 insertions(+)
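
For context, the intended usage is roughly the following - a minimal sketch
based on the description above; the cache, the upper-bound calculation and
the critical section are hypothetical, and the exact
kmem_cache_alloc_from_sheaf() signature may differ from what's shown here:

	static int example_reserve_and_use(struct kmem_cache *cache,
					   unsigned int max_needed)
	{
		struct slab_sheaf *sheaf;
		void *obj;

		/* outside the critical section: prefill up to the upper bound */
		sheaf = kmem_cache_prefill_sheaf(cache, GFP_KERNEL, max_needed);
		if (!sheaf)
			return -ENOMEM;

		/*
		 * in the non-sleeping critical section: allocations are
		 * guaranteed to succeed until the prefilled objects run out
		 */
		obj = kmem_cache_alloc_from_sheaf(cache, GFP_NOWAIT, sheaf);

		/* ... use obj ... */

		/*
		 * afterwards: hand the sheaf back; it will be refilled to
		 * sheaf_capacity if possible, or dissolved otherwise
		 */
		kmem_cache_return_sheaf(cache, GFP_KERNEL, sheaf);

		return 0;
	}

If more objects turn out to be needed between critical sections, the sheaf
can presumably be grown with kmem_cache_refill_sheaf() (which takes the
sheaf via an indirect pointer since it may be replaced), and
kmem_cache_sheaf_size() reports how many objects are currently left in it.
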
> 
> [... snip ... ]
> 
>> @@ -4831,6 +4857,207 @@ void *kmem_cache_alloc_node_noprof(struct kmem_cache *s, gfp_t gfpflags, int nod
>>  }
>>  EXPORT_SYMBOL(kmem_cache_alloc_node_noprof);
>>  
>> +
>> +/*
>> + * returns a sheaf that has at least the requested size
>> + * when prefilling is needed, do so with the given gfp flags
>> + *
>> + * return NULL if sheaf allocation or prefilling failed
>> + */
>> +struct slab_sheaf *
>> +kmem_cache_prefill_sheaf(struct kmem_cache *s, gfp_t gfp, unsigned int size)
>> +{
>> +	struct slub_percpu_sheaves *pcs;
>> +	struct slab_sheaf *sheaf = NULL;
>> +
>> +	if (unlikely(size > s->sheaf_capacity)) {
>> +		sheaf = kzalloc(struct_size(sheaf, objects, size), gfp);
>> +		if (!sheaf)
>> +			return NULL;
>> +
>> +		sheaf->cache = s;
>> +		sheaf->capacity = size;
>> +
>> +		if (!__kmem_cache_alloc_bulk(s, gfp, size,
>> +					     &sheaf->objects[0])) {
>> +			kfree(sheaf);
>> +			return NULL;
>> +		}
>> +
>> +		sheaf->size = size;
>> +
>> +		return sheaf;
>> +	}
>> +
>> +	localtry_lock(&s->cpu_sheaves->lock);
>> +	pcs = this_cpu_ptr(s->cpu_sheaves);
>> +
>> +	if (pcs->spare) {
>> +		sheaf = pcs->spare;
>> +		pcs->spare = NULL;
>> +	}
>> +
>> +	if (!sheaf)
>> +		sheaf = barn_get_full_or_empty_sheaf(pcs->barn);
> 
> Can this be outside localtry lock?

Strictly speaking we'd have to save the barn pointer first, otherwise cpu
hotremove could bite us, I think. But it's not worth the trouble, as the
localtry lock is just disabling preemption and taking the barn lock would
disable irqs anyway. So we're not increasing contention by holding the
localtry lock longer than strictly necessary.
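
For illustration, "saving the barn pointer first" would mean roughly the
following (a sketch of the alternative being discussed, not what the patch
does):

	struct node_barn *barn;

	localtry_lock(&s->cpu_sheaves->lock);
	pcs = this_cpu_ptr(s->cpu_sheaves);

	if (pcs->spare) {
		sheaf = pcs->spare;
		pcs->spare = NULL;
	}

	/* save the barn pointer before dropping the lock */
	barn = pcs->barn;

	localtry_unlock(&s->cpu_sheaves->lock);

	if (!sheaf)
		sheaf = barn_get_full_or_empty_sheaf(barn);
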

> 
>> +
>> +	localtry_unlock(&s->cpu_sheaves->lock);
>> +
>> +	if (!sheaf) {
>> +		sheaf = alloc_empty_sheaf(s, gfp);
>> +	}
>> +
>> +	if (sheaf && sheaf->size < size) {
>> +		if (refill_sheaf(s, sheaf, gfp)) {
>> +			sheaf_flush(s, sheaf);
>> +			free_empty_sheaf(s, sheaf);
>> +			sheaf = NULL;
>> +		}
>> +	}
>> +
>> +	if (sheaf)
>> +		sheaf->capacity = s->sheaf_capacity;
>> +
>> +	return sheaf;
>> +}
>> +
>> +/*
>> + * Use this to return a sheaf obtained by kmem_cache_prefill_sheaf().
>> + * It tries to refill the sheaf back to the cache's sheaf_capacity
>> + * to avoid handling partially full sheaves.
>> + *
>> + * If the refill fails because gfp is e.g. GFP_NOWAIT, the sheaf is
>> + * instead dissolved.
>> + */
>> +void kmem_cache_return_sheaf(struct kmem_cache *s, gfp_t gfp,
>> +			     struct slab_sheaf *sheaf)
>> +{
>> +	struct slub_percpu_sheaves *pcs;
>> +	bool refill = false;
>> +	struct node_barn *barn;
>> +
>> +	if (unlikely(sheaf->capacity != s->sheaf_capacity)) {
>> +		sheaf_flush(s, sheaf);
>> +		kfree(sheaf);
>> +		return;
>> +	}
>> +
>> +	localtry_lock(&s->cpu_sheaves->lock);
>> +	pcs = this_cpu_ptr(s->cpu_sheaves);
>> +
>> +	if (!pcs->spare) {
>> +		pcs->spare = sheaf;
>> +		sheaf = NULL;
>> +	} else if (pcs->barn->nr_full >= MAX_FULL_SHEAVES) {
> 
> Did you mean (pcs->barn->nr_full < MAX_FULL_SHEAVES)?

Oops yeah, fixing this can potentially improve performance.

> Otherwise looks good to me.

Thanks a lot!
