[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <49842841.1090902@goop.org>
Date: Sat, 31 Jan 2009 02:30:25 -0800
From: Jeremy Fitzhardinge <jeremy@...p.org>
To: Ingo Molnar <mingo@...e.hu>
CC: "H. Peter Anvin" <hpa@...or.com>, Tejun Heo <tj@...nel.org>,
Brian Gerst <brgerst@...il.com>, ebiederm@...ssion.com,
cl@...ux-foundation.org, rusty@...tcorp.com.au, travis@....com,
linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
steiner@....com, hugh@...itas.com
Subject: Re: [patch] add optimized generic percpu accessors
Ingo Molnar wrote:
> Tejun, could you please also add the patch below to your lineup too?
>
> It is an optimization and a cleanup, and adds the following new generic
> percpu methods:
>
> percpu_read()
> percpu_write()
> percpu_add()
> percpu_sub()
> percpu_or()
> percpu_xor()
>
> and implements support for them on x86. (other architectures will fall
> back to a default implementation)
>
> The advantage is that for example to read a local percpu variable, instead
> of this sequence:
>
> return __get_cpu_var(var);
>
> ffffffff8102ca2b: 48 8b 14 fd 80 09 74 mov -0x7e8bf680(,%rdi,8),%rdx
> ffffffff8102ca32: 81
> ffffffff8102ca33: 48 c7 c0 d8 59 00 00 mov $0x59d8,%rax
> ffffffff8102ca3a: 48 8b 04 10 mov (%rax,%rdx,1),%rax
>
> We can get a single instruction by using the optimized variants:
>
> return percpu_read(var);
>
> ffffffff8102ca3f: 65 48 8b 05 91 8f fd mov %gs:0x7efd8f91(%rip),%rax
>
> I also cleaned up the x86-specific APIs and made the x86 code use these
> new generic percpu primitives.
>
> It looks quite hard to convince the compiler to generate the optimized
> single-instruction sequence for us out of __get_cpu_var(var) - or can you
> perhaps see a way to do it?
>
> The patch is against your latest zero-based percpu / pda unification tree.
> Untested.
I have no objection to this patch overall, or the use of non-arch
specific names.
But there is one subtle thing you're overlooking here. The x86_*_percpu
operations are guaranteed to be atomic with respect to preemption, so
you can use them when preemption is enabled. When they compile down to
one instruction then that happens naturally, otherwise they have to be
wrapped with preempt_disable/enable.
Otherwise, being able to access a percpu with one instruction is nice,
but it isn't all that efficient. If you're going to access a variable
more than once or twice, its more efficient to take the address and
access it via that.
So, upshot, I think the default versions should be wrapped in
preempt_disable/enable to preserve this interface invariant.
J
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists