Date:	Thu, 28 Feb 2008 11:08:49 -0800 (PST)
From:	Christoph Lameter <clameter@....com>
To:	Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
cc:	Eric Dumazet <dada1@...mosbay.com>,
	Pekka Enberg <penberg@...helsinki.fi>,
	Torsten Kaiser <just.for.lkml@...glemail.com>,
	Ingo Molnar <mingo@...e.hu>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] Implement slub fastpath in terms of freebase and freeoffset

On Thu, 28 Feb 2008, Mathieu Desnoyers wrote:

> Yep, I have an experimental patch ready. It would need some thorough
> testing though. It seems to run fine on an x86_32 with and without
> preemption (therefore with and without the slub cmpxchg_local fastpath).

What is the performance impact?

More comments follow...


> freeoffset & PAGE_MASK is the offset of the freelist element within the page.
> freeoffset & ~PAGE_MASK is the counter to detect object re-use.
> freebase is the address of the page in which the object is located. It is a
> multiple of PAGE_SIZE.
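
(As a worked example of the encoding described above, assuming 4 KiB pages
and a single-page slab: for a free object at 0xc0123040, freebase holds the
page address 0xc0123000, the in-page offset bits of freeoffset hold 0x040,
and the remaining bits of freeoffset hold the re-use counter.)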

How does the page mask work for order 1 and other higher order allocations?
 
> Whenever an object is freed in the cpu slab cache, the counter is incremented.
> Whenever the alloc/free slow paths are modifying the freeoffset or freebase, the
> freeoffset counter is also incremented. It is used to make sure we know if
> freebase has been modified in an interrupt nested over the fast path.

> +	unsigned long freebase;	/*
> +				 * Pointer to the page address
> +				 * containing the first free per cpu
> +				 * object.
> +				 */

This is page_address(c->page)?

>  	struct page *page;	/* The slab from which we are allocating */
>  	int node;		/* The node of the page (or -1 for debug) */
>  	unsigned int offset;	/* Freepointer offset (in word units) */

> +/*
> + * Used when interrupts are disabled, so no memory barriers are needed.
> + */
> +static inline void **get_freelist_ptr(struct kmem_cache_cpu *c)
> +{
> +	return (void **)(c->freebase | (c->freeoffset & PAGE_MASK));
> +}

page_address(c->page) + (c->freeoffset & (PAGE_MASK << s->order)) ?

store the page mask also in the kmem_cache_cpu structure?
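
A minimal sketch of that suggestion, assuming a hypothetical mask field
(say, c->slabmask = (PAGE_SIZE << order) - 1, precomputed when the cpu slab
is set up) and assuming the in-slab offset lives in the low bits of
freeoffset:

	static inline void **get_freelist_ptr(struct kmem_cache_cpu *c)
	{
		/* in-slab offset bits of freeoffset; also covers higher order slabs */
		return (void **)((unsigned long)page_address(c->page) +
				 (c->freeoffset & c->slabmask));
	}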

> +/*
> + * Updates the freebase and freeoffset concurrently with slub fastpath.
> + * We can nest over the fastpath as an interrupt. Therefore, all the freebase
> + * and freeoffset modifications will appear to have happened atomically from the
> + * fastpath point of view. (those are per-cpu variables)
> + */
> +static inline void set_freelist_ptr(struct kmem_cache_cpu *c, void **freelist)
> +{
> +	c->freebase = (unsigned long)freelist & ~PAGE_MASK;
> +	c->freeoffset += PAGE_SIZE;
> +	c->freeoffset &= ~PAGE_MASK;
> +	c->freeoffset |= (unsigned long)freelist & PAGE_MASK;
> +}

How is this going to show up as an atomic update? You modify multiple per 
cpu fields and an interrupt could happen in between.
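
To make the window concrete (the annotation is hypothetical, not part of
the patch):

	c->freebase = (unsigned long)freelist & ~PAGE_MASK;
	/* <-- an interrupt arriving here sees the new freebase paired with
	 *     the old freeoffset/counter */
	c->freeoffset += PAGE_SIZE;
	...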

> @@ -1411,7 +1411,7 @@ static void deactivate_slab(struct kmem_
>  	struct page *page = c->page;
>  	int tail = 1;
>  
> -	if (c->freelist)
> +	if (get_freelist_ptr(c))
>  		stat(c, DEACTIVATE_REMOTE_FREES);
>  	/*

The original code is wrong. It should be

if (is_end(c->freelist))
	stat(c, DEACTIVATE_REMOTE_FREES);


> -		c->freelist = c->freelist[c->offset];
> +		object = get_freelist_ptr(c);
> +		set_freelist_ptr(c, object[c->offset]);

That use of set_freelist_ptr() is safe since interrupts are disabled. So the
function can only be used if interrupts are disabled?

> +		/* No need to increment the MSB counter here, because only
> +		 * object free will lead to counter re-use. */
> +	} while (cmpxchg_local(&c->freeoffset, freeoffset,
> +		(freeoffset + (c->offset * sizeof(void *)))) != freeoffset);

How does this work? You replace the freeoffset with the one incremented by
the object offset??? The objects may not be in sequence on the freelist, and
the increment won't get you to the next object anyway, since c->offset is
usually 0.

