Message-ID: <497E705B.5000302@kernel.org>
Date: Tue, 27 Jan 2009 11:24:27 +0900
From: Tejun Heo <tj@...nel.org>
To: Rusty Russell <rusty@...tcorp.com.au>
CC: Ingo Molnar <mingo@...e.hu>,
Herbert Xu <herbert@...dor.apana.org.au>,
akpm@...ux-foundation.org, hpa@...or.com, brgerst@...il.com,
ebiederm@...ssion.com, cl@...ux-foundation.org, travis@....com,
linux-kernel@...r.kernel.org, steiner@....com, hugh@...itas.com,
"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
Mathieu Desnoyers <mathieu.desnoyers@...ymtl.ca>
Subject: Re: [PATCH] percpu: add optimized generic percpu accessors
Hello, Rusty.
Rusty Russell wrote:
>> No, they're not. They're preempt safe as mentioned in the comment
>> and is basically just generalization of the original x86 versions
>> used by x86_64 on SMP before pda and percpu areas were merged. I
>> agree that it's something very close to local_t and it would be
>> nice to see those somehow unified (and I have patches which make
>> use of local_t in my queue waiting for dynamic percpu allocation).
>
> Yes, which is one reason I dislike Ingo's patch:
> 1) Mine did just read because that covers the most common fast-path use
> and is easily atomic for word-sizes on all archs,
> 2) Didn't replace x86, just #defined generic one, so much less churn,
> 3) read_percpu_var and read_percpu_ptr variants following the convention
> reinforced by my other patches.
>
> Linus' tree had read/write/add/or counts at 22/13/0/0. Yours has
> more write usage, so I'm happy there, but still only one add and one
> or. If we assume that generic code will look a bit like that when
> converted, I'm not convinced that generic and/or/etc ops are worth
> it.
There actually were quite a few places where atomic add ops would be
useful, especially where statistics are collected. For logical
bitops, I don't think we'll have too many users.
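To make the statistics case concrete, here is a rough userspace sketch
of the pattern: each CPU adds to its own slot on the fast path and a
reader folds the slots together on the slow path. This is not kernel
code. The names (stat_sketch, stat_add, stat_sum, NR_CPUS_SKETCH) are
made up for illustration, and the caller passes the cpu number
explicitly instead of the accessor deriving it.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical fixed CPU count */

/* One counter slot per CPU; a stand-in for a real percpu variable. */
struct stat_sketch {
	unsigned long rx_packets[NR_CPUS_SKETCH];
};

/* Fast path: touch only the slot that belongs to the given CPU. */
static void stat_add(struct stat_sketch *s, unsigned int cpu, unsigned long n)
{
	assert(cpu < NR_CPUS_SKETCH);
	s->rx_packets[cpu] += n;
}

/* Slow path: fold every CPU's slot into a single total. */
static unsigned long stat_sum(const struct stat_sketch *s)
{
	unsigned long total = 0;

	for (size_t cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		total += s->rx_packets[cpu];
	return total;
}
```

The add is the operation that would have to be preempt safe (or
atomic) in a real implementation; the sum is only ever approximate
while other CPUs keep counting.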
> If they are worth doing generically, should the ops be atomic? To
> extrapolate from x86 usages again, it seems to be happy with
> non-atomic (tho of course it is atomic on x86).
If atomic rw/add/sub are implementable on most archs (and judging from
local_t, I suppose they are), I think the ops should be atomic, so
that they can replace local_t and we won't need something else again
in the future.
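The difference between the two flavours can be sketched in userspace
with C11 atomics. This is only an analogy: C11 atomics are atomic
across threads, which is stronger than what local_t needs (atomicity
against interrupts on the local CPU), and pcpu_counter_sketch and both
helpers are hypothetical names, not existing kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdatomic.h>

/* Hypothetical counter type standing in for one per-CPU slot. */
typedef struct {
	atomic_long v;
} pcpu_counter_sketch;

/*
 * Non-atomic flavour: a separate load and store. Anything that runs
 * between the two steps and updates the counter gets overwritten.
 */
static long counter_add_nonatomic(pcpu_counter_sketch *c, long n)
{
	long old;

	assert(c != NULL);
	old = atomic_load_explicit(&c->v, memory_order_relaxed);
	atomic_store_explicit(&c->v, old + n, memory_order_relaxed);
	return old + n;
}

/* Atomic flavour: one indivisible read-modify-write. */
static long counter_add_atomic(pcpu_counter_sketch *c, long n)
{
	assert(c != NULL);
	return atomic_fetch_add_explicit(&c->v, n, memory_order_relaxed) + n;
}
```

Both return the same value when nothing interferes; they only differ
when an update lands between the load and the store of the first one.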
>> Another question to ask is whether to keep using separate
>> interfaces for static and dynamic percpu variables or migrate to
>> something which can take both.
>
> Well, IA64 can do stuff with static percpus that it can't do with
> dynamic (assuming we get expanding dynamic percpu areas
> later). That's because they use TLB tricks for a static 64k per-cpu
> area, but this doesn't scale. That might not be vital: abandoning
> that trick will mean they can't optimise read_percpu/read_percpu_var
> etc as much.
Isn't something like the following possible?
  #define pcpu_read(ptr)					\
  ({								\
	if (__builtin_constant_p(ptr) &&			\
	    (ptr) >= PCPU_STATIC_START &&			\
	    (ptr) < PCPU_STATIC_END)				\
		do 64k TLB trick for static pcpu;		\
	else							\
		do generic stuff;				\
  })
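A compilable userspace sketch of the same dispatch, under some
assumptions: the static area is modelled as a plain array, the range
check is done at run time, and the __builtin_constant_p() guard is
left out because whether a compiler folds address comparisons into a
constant depends on the compiler. Both branches just dereference the
pointer here; the path tag marks where an arch-specific fast path
would go. All names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PCPU_STATIC_SLOTS 8	/* hypothetical size of the static area */

/* Stand-in for the statically allocated percpu area. */
static long pcpu_static_area[PCPU_STATIC_SLOTS];

enum pcpu_path { PCPU_PATH_STATIC, PCPU_PATH_GENERIC };

/* Range check on integer addresses, so unrelated pointers are never compared. */
static int pcpu_is_static(const long *ptr)
{
	uintptr_t p = (uintptr_t)ptr;
	uintptr_t start = (uintptr_t)&pcpu_static_area[0];
	uintptr_t end = (uintptr_t)&pcpu_static_area[PCPU_STATIC_SLOTS];

	return p >= start && p < end;
}

/* Read through ptr and report which path was selected. */
static long pcpu_read_sketch(const long *ptr, enum pcpu_path *path)
{
	assert(ptr != NULL && path != NULL);
	if (pcpu_is_static(ptr)) {
		*path = PCPU_PATH_STATIC;	/* arch fast path would go here */
		return *ptr;
	}
	*path = PCPU_PATH_GENERIC;		/* generic access for dynamic percpu */
	return *ptr;
}
```

With a single entry point like this, callers would not need to know
whether a variable is static or dynamic.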
> Tejun, any chance of you updating the tj-percpu tree? My current
> patches are against Linus's tree, and rebasing them on yours
> involves some icky merging.
If Ingo is okay with it, I'm fine with it too. Unless Ingo objects,
I'll do it tomorrow-ish (still big holiday here).
Thanks.
--
tejun