lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 24 Nov 2010 01:22:24 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Christoph Lameter <cl@...ux.com>
Cc:	akpm@...ux-foundation.org, Pekka Enberg <penberg@...helsinki.fi>,
	Ingo Molnar <mingo@...e.hu>,
	Peter Zijlstra <peterz@...radead.org>,
	linux-kernel@...r.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Tejun Heo <tj@...nel.org>
Subject: Re: [thiscpuops upgrade 10/10] Lockless (and preemptless)
 fastpaths for slub

Le mardi 23 novembre 2010 à 17:51 -0600, Christoph Lameter a écrit :
> pièce jointe document texte brut (slub_generation)
> Use the this_cpu_cmpxchg_double functionality to implement a lockless
> allocation algorith.
> 
> Each of the per cpu pointers is paired with a transaction id that ensures
> that updates of the per cpu information can only occur in sequence on
> a certain cpu.
> 
> A transaction id is a "long" integer that is comprised of an event number
> and the cpu number. The event number is incremented for every change to the
> per cpu state. This means that the cmpxchg instruction can verify for an
> update that nothing interfered and that we are updating the percpu structure
> for the processor where we picked up the information and that we are also
> currently on that processor when we update the information.
> 
> This results in a significant decrease of the overhead in the fastpaths. It
> also makes it easy to adopt the fast path for realtime kernels since this
> is lockless and does not require that the use of the current per cpu area
> over the critical section. It is only important that the per cpu area is
> current at the beginning of the critical section and at that end.
> 
> So there is no need even to disable preemption which will make the allocations
> scale well in a RT environment.
> 
> [Beware: There have been previous attempts at lockless fastpaths that
> did not succeed. We hope to have learned from these experiences but
> review certainly is necessary.]
> 
> Cc: Ingo Molnar <mingo@...e.hu>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Signed-off-by: Christoph Lameter <cl@...ux.com>
> 
> ---

>  
>  /*
> + * Calculate the next globally unique transaction for disambiguiation
> + * during cmpxchg. The transactions start with the cpu number and are then
> + * incremented by CONFIG_NR_CPUS.
> + */
> +static inline unsigned long next_tid(unsigned long tid)
> +{
> +	return tid + CONFIG_NR_CPUS;
> +}


Hmm, this only works for power of two NR_CPUS, or else one cpu 'tid'
could wrap on another cpu tid.

I suggest using 4096  (or roundup_pow_of_two(CONFIG_NR_CPUS))



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists