linux-kernel - Re: [thiscpuops upgrade 10/10] Lockless (and preemptless) fastpaths for slub

Message-ID: <20101124010554.GC8264@Krystal>
Date:	Tue, 23 Nov 2010 20:05:55 -0500
From:	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To:	Christoph Lameter <cl@...ux.com>
Cc:	akpm@...ux-foundation.org, Pekka Enberg <penberg@...helsinki.fi>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org,
	Eric Dumazet <eric.dumazet@...il.com>,
	Tejun Heo <tj@...nel.org>
Subject: Re: [thiscpuops upgrade 10/10] Lockless (and preemptless)
	fastpaths for slub

* Mathieu Desnoyers (mathieu.desnoyers@...icios.com) wrote:
> * Christoph Lameter (cl@...ux.com) wrote:
> 
> [...]
> 
> > @@ -1737,23 +1770,53 @@ static __always_inline void *slab_alloc(
> >  {
> >  	void **object;
> >  	struct kmem_cache_cpu *c;
> > -	unsigned long flags;
> > +	unsigned long tid;
> >  
> >  	if (slab_pre_alloc_hook(s, gfpflags))
> >  		return NULL;
> >  
> > -	local_irq_save(flags);
> > +redo:
> > +	/*
> > +	 * Must read kmem_cache cpu data via this cpu ptr. Preemption is
> > +	 * enabled. We may switch back and forth between cpus while
> > +	 * reading from one cpu area. That does not matter as long
> > +	 * as we end up on the original cpu again when doing the cmpxchg.
> > +	 */
> >  	c = __this_cpu_ptr(s->cpu_slab);
> > +
> > +	/*
> > +	 * The transaction ids are globally unique per cpu and per operation on
> > +	 * a per cpu queue. Thus they can guarantee that the cmpxchg_double
> > +	 * occurs on the right processor and that there was no operation on the
> > +	 * linked list in between.
> > +	 */
> 
> There seems to be some voodoo magic I don't understand here. I'm curious to see
> what happens if we have:
> 
> CPU A                                                  CPU B
> slab_alloc()
>   c = __this_cpu_ptr(s->cpu_slab);
>   tid = c->tid
>   thread migrated to CPU B
> 
> slab_alloc()
>   c = __this_cpu_ptr(s->cpu_slab);
>   tid = c->tid
>   ...                                                  ...
>   irqsafe_cmpxchg_double
>     - expect tid, on CPU A, success
>                                                        migrate back to CPU A
>   irqsafe_cmpxchg_double
>     - expect (same) tid, on CPU A, success

Ah! I knew I was missing something: the second cmpxchg will fail because it
expects "tid", but the first cmpxchg has already replaced that value with
next_tid(tid). So effectively, many instances of the same transaction can run
concurrently, but only one will succeed.
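
To make that concrete, here is a minimal user-space sketch of the scenario,
replayed by hand in a single thread. It is not the kernel code: struct
cpu_slab_sim, cmpxchg_double_sim(), next_tid_sim() and replay_race() are
hypothetical stand-ins, the compare-and-exchange is modeled in plain C and is
not atomic, and next_tid() is assumed to be a simple increment because its
definition is not part of the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-cpu (freelist, tid) pair. */
struct cpu_slab_sim {
	void *freelist;
	unsigned long tid;
};

/* Assumption: the real next_tid() is not shown in the quoted hunk. */
static unsigned long next_tid_sim(unsigned long tid)
{
	return tid + 1;
}

/*
 * Plain C model of a double-word compare-and-exchange: both words are
 * updated only if both still hold the expected values. Not atomic; the
 * interleaving is replayed sequentially in replay_race().
 */
static bool cmpxchg_double_sim(struct cpu_slab_sim *c,
			       void *old_obj, unsigned long old_tid,
			       void *new_obj, unsigned long new_tid)
{
	if (c->freelist != old_obj || c->tid != old_tid)
		return false;
	c->freelist = new_obj;
	c->tid = new_tid;
	return true;
}

struct race_result {
	bool first_ok;		/* did the first commit succeed? */
	bool second_ok;		/* did the second commit succeed? */
	bool tid_stale;		/* was the second snapshot's tid outdated? */
	unsigned long final_tid;
};

static struct race_result replay_race(void)
{
	void *objs[2];	/* objs[0] is the list head, objs[1] the next object */
	struct cpu_slab_sim cpu_a = { .freelist = &objs[0], .tid = 0 };
	struct race_result r;

	/* Both tasks snapshot CPU A's state before either one commits. */
	void *obj1 = cpu_a.freelist;
	unsigned long tid1 = cpu_a.tid;
	void *obj2 = cpu_a.freelist;
	unsigned long tid2 = cpu_a.tid;

	assert(tid1 == tid2);	/* same transaction id seen twice */

	/* The first commit wins and advances the tid. */
	r.first_ok = cmpxchg_double_sim(&cpu_a, obj1, tid1,
					&objs[1], next_tid_sim(tid1));

	/* The second commit still expects the old tid, so it must fail. */
	r.tid_stale = (cpu_a.tid != tid2);
	r.second_ok = cmpxchg_double_sim(&cpu_a, obj2, tid2,
					 &objs[1], next_tid_sim(tid2));
	r.final_tid = cpu_a.tid;
	return r;
}
```

In this replay the freelist word has changed as well, but the stale tid alone
is enough to reject the second commit; the loser would simply go back to the
redo label and take a fresh snapshot.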

Sorry for the noise.

Thanks,

Mathieu


> 
> So either there is a crucially important point I am missing, or the transaction
> ID does not seem to be truly unique due to migration.
> 
> Thanks,
> 
> Mathieu
> 
> 
> > +	tid = c->tid;
> > +	barrier();
> > +
> >  	object = c->freelist;
> > -	if (unlikely(!object || !node_match(c, node)))
> > +	if (unlikely(!object || !node_match(c, c->node)))
> >  
> > -		object = __slab_alloc(s, gfpflags, node, addr, c);
> > +		object = __slab_alloc(s, gfpflags, c->node, addr);
> >  
> >  	else {
> > -		c->freelist = get_freepointer(s, object);
> > +		/*
> > +		 * The cmpxchg will only match if there was no additional
> > +		 * operation and if we are on the right processor.
> > +		 */
> > +		if (unlikely(!irqsafe_cmpxchg_double(&s->cpu_slab->freelist, object, tid,
> > +				get_freepointer(s, object), next_tid(tid)))) {
> 
> 

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com