Message-Id: <200812121847.06432.rusty@rustcorp.com.au>
Date: Fri, 12 Dec 2008 18:47:05 +1030
From: Rusty Russell <rusty@...tcorp.com.au>
To: Eric Dumazet <dada1@...mosbay.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Theodore Tso <tytso@....edu>,
linux kernel <linux-kernel@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Mingming Cao <cmm@...ibm.com>, linux-ext4@...r.kernel.org,
Christoph Lameter <clameter@....com>
Subject: Re: [PATCH] percpu_counter: Fix __percpu_counter_sum()
On Thursday 11 December 2008 09:26:37 Eric Dumazet wrote:
> But then, some (all but x86 ;) ) arches don't have a true local_t and we fall
> back to plain atomic_long_t, and this is wrong because it would add a LOCKED
> instruction in the fast path.
>
> I remember Christoph added FAST_CMPXCHG_LOCAL, but there are no more uses of
> it in the current tree.
>
> I.e.: use local_t only if CONFIG_FAST_CMPXCHG_LOCAL, else something like:
>
> void __percpu_counter_add_irqsafe(struct percpu_counter *fbc, s64 amount, s32 batch)
> {
>         s64 count;
>         s32 *pcount = per_cpu_ptr(fbc->counters, get_cpu());
>         unsigned long flags;
>
>         local_irq_save(flags);
>         count = *pcount + amount;
This is dumb though. If local_irq_save(), add, local_irq_restore() is faster
than atomic_long_add() on some arch, then *that* is what that arch's
local_add() should do!

Open-coding it like this is obviously wrong.
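To make it concrete, here's a minimal sketch (the function name and field
access are illustrative, not actual tree code) of how such an arch could
implement local_add() itself on top of irq disabling:

#include <linux/irqflags.h>     /* local_irq_save()/local_irq_restore() */
#include <asm/local.h>          /* local_t */

/*
 * Hypothetical arch implementation: if disabling interrupts is cheaper
 * than a LOCKed read-modify-write on this arch, the sequence belongs
 * inside local_add() so that callers never open-code it.
 */
static inline void sketch_local_add(long i, local_t *l)
{
        unsigned long flags;

        local_irq_save(flags);
        l->a.counter += i;      /* plain add: IRQs off, data is per-cpu */
        local_irq_restore(flags);
}

Then generic code like percpu_counter can just call local_add() and each
arch gets its cheapest implementation for free.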
Now, the arch local.h implementations need attention (x86-32 can be optimized
today, for example), but that's not directly related.
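For reference, here's roughly what x86's local_add() looks like today, i.e.
what a true local_t buys you (quoting from memory, so treat the details as
approximate):

static inline void local_add(long i, local_t *l)
{
        asm volatile(_ASM_ADD "%1,%0"
                     : "+m" (l->a.counter)
                     : "ir" (i));
}

A single add, no LOCK prefix, because the counter is only ever touched by
its own CPU. That's exactly the cost Eric is worried about losing in the
atomic_long_t fallback.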
Hope that clarifies,
Rusty.
PS. Yes, I should produce a documentation patch and fix the x86 version.
Added to TODO list.