Message-ID: <1289832427.2607.84.camel@edumazet-laptop>
Date: Mon, 15 Nov 2010 15:47:07 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Christoph Lameter <cl@...ux.com>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
David Miller <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
Arnaldo Carvalho de Melo <acme@...radead.org>,
Ingo Molnar <mingo@...e.hu>, Andi Kleen <andi@...stfloor.org>,
Nick Piggin <npiggin@...nel.dk>
Subject: Re: [PATCH] atomic: add atomic_inc_not_zero_hint()
On Monday, 15 November 2010 at 08:25 -0600, Christoph Lameter wrote:
> On Mon, 15 Nov 2010, Eric Dumazet wrote:
>
> > Exclusive access ? As soon as another cpu takes it again, you lose.
>
> Sure but you want to avoid the fetch in shared mode here.
>
Yes, this is what cmpxchg() does for sure.
> > It's not really the same thing... Maybe you missed the intention of the
> > 'hint' entirely. We know the probable value of the counter; we don't want to read it.
>
> Ok, maybe in this case you can predict the value, but in general it is
> difficult to always provide an expected value. It would be easier to be
> able to tell the processor that the cacheline should not be fetched as
> shared but immediately in exclusive state.
>
Maybe it's not clear, but atomic_inc_not_zero_hint() is going to be used
only in contexts where we know the expected value, not as a generic
replacement for atomic_inc_not_zero(). Even if the cache line is already
hot in this cpu's cache, it should be faster or the same speed.
Then, in high contention contexts, using atomic_inc_not_zero_hint() with
whatever initial hint might also be a win over atomic_inc_not_zero(),
but we try to remove such contexts ;)
And two atomic_cmpxchg() are probably slower in non-contended contexts,
in particular if the cache line is already hot in this cpu's cache.
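
To make the difference concrete, here is a rough userspace sketch using C11
<stdatomic.h>. It is not the kernel patch itself: the names inc_not_zero()
and inc_not_zero_hint() are made up for illustration, and atomic_int stands
in for atomic_t. The generic form reads the counter before the cmpxchg,
while the hinted form starts the cmpxchg loop directly from the caller's
guess, so the first access to the cache line is already the cmpxchg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Generic form (hypothetical name): load the counter first, then try to
 * bump it. The initial load is a plain read of the cache line. */
static bool inc_not_zero(atomic_int *v)
{
	assert(v != NULL);
	int c = atomic_load(v);

	while (c != 0) {
		/* On failure, c is updated to the value currently in *v. */
		if (atomic_compare_exchange_strong(v, &c, c + 1))
			return true;
	}
	return false;
}

/* Hinted form (hypothetical name): skip the initial load and start the
 * cmpxchg loop from the caller's guess of the counter value. */
static bool inc_not_zero_hint(atomic_int *v, int hint)
{
	assert(v != NULL);
	int c = hint;

	/* A zero hint carries no information; use the generic form. */
	if (c == 0)
		return inc_not_zero(v);

	do {
		if (atomic_compare_exchange_strong(v, &c, c + 1))
			return true;
		/* c now holds the value actually observed in *v. */
	} while (c != 0);

	return false;
}
```

If the hint is right, a single cmpxchg does the whole job. If it is wrong,
the failed cmpxchg hands back the real value and the loop continues exactly
like the generic version, so a bad hint costs one extra attempt but never
gives a wrong result.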
> > atomic_read() and atomic_cmpxchg(). We tried prefetchw() and it was a
> > performance drop. It was with only 16 cpus contending on neighbour
>
> Does prefetchw work? Andi claims that prefetchw is not working on
> x86 and I doubt that you ran tests on Itanium.
In fact, in benchmarks, prefetch() and prefetchw() are a pain on x86, or
at least "perf tools" shows artifacts on them (a high number of cycles
consumed on these instructions).
Andi had a patch to disable prefetch() in list iterators, and it's a win.
I don't have an Itanium platform to run tests on. Is cmpxchg() that bad on
ia64? I also have old AMD cpus, so I cannot say if recent ones handle
prefetchw() better...