lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:	Mon, 15 Nov 2010 15:47:07 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Christoph Lameter <cl@...ux.com>
Cc:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	David Miller <davem@...emloft.net>,
	netdev <netdev@...r.kernel.org>,
	Arnaldo Carvalho de Melo <acme@...radead.org>,
	Ingo Molnar <mingo@...e.hu>, Andi Kleen <andi@...stfloor.org>,
	Nick Piggin <npiggin@...nel.dk>
Subject: Re: [PATCH] atomic: add atomic_inc_not_zero_hint()

Le lundi 15 novembre 2010 à 08:25 -0600, Christoph Lameter a écrit :
> On Mon, 15 Nov 2010, Eric Dumazet wrote:
> 
> > Exclusive access ? As soon as another cpu takes it again, you lose.
> 
> Sure but you want to avoid the fetch in shared mode here.
> 

Yes, this is what cmpxchg() does for sure.

> > Its not really the same thing... Maybe you miss the 'hint' intention at
> > all. We know the probable value of the counter, we dont want to read it.
> 
> Ok may be in thise case you can predict the value but in general it is
> difficult to always provide an expected value. It would be easier to be
> able to tell the processor that the cacheline should not be fetched as
> shared but immediately in exclusive state.
> 

Maybe its not clear, but atomic_inc_not_zero_hint() is going to be used
only in contexts we know the expected value, and not as a generic
replacement for atomic_inc_not_zero(). Even if cache line is already hot
in this cpu cache, it should be faster or same speed.

Then, in high contention contexts, using atomic_inc_not_zero_hint() with
whatever initial hint might also be a win over atomic_inc_not_zero(),
but we try to remove such contexts ;)

And two atomic_cmpxchg() are probably slower in non contended contexts,
in particular is cache line is already hot in this cpu cache.

> > atomic_read() and atomic_cmpxchg(). We tried prefetchw() and it was a
> > performance drop. It was with only 16 cpus contending on neighbour
> 
> Does prefetchw work? Andi claims that prefetchw is not working on
> x86 and I doubt that you ran tests on Itanium.

In fact, in benchmarks, prefetch() or prefetchw() are a pain on x86, or
at least "perf tools" show artifact on them (high number of cycles
consumed on these instructions)

Andi had a patch to disable prefetch() in list iterators, and its a win.

I dont have Itanium platform to run tests. Is cmpxchg() that bad on
ia64 ? I also have old AMD cpus, so I cannot say if recent ones handle
prefetchw() better...



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ