Date:	Thu, 15 Jan 2009 19:26:24 +0900
From:	Tejun Heo <tj@...nel.org>
To:	Ingo Molnar <mingo@...e.hu>
CC:	"H. Peter Anvin" <hpa@...or.com>, Brian Gerst <brgerst@...il.com>,
	ebiederm@...ssion.com, cl@...ux-foundation.org,
	rusty@...tcorp.com.au, travis@....com,
	linux-kernel@...r.kernel.org, akpm@...ux-foundation.org,
	steiner@....com, hugh@...itas.com
Subject: Re: [patch] add optimized generic percpu accessors

Hello, Ingo.

Ingo Molnar wrote:
> Tejun, could you please also add the patch below to your lineup?

Sure thing.

> It is an optimization and a cleanup, and adds the following new generic 
> percpu methods:
> 
>   percpu_read()
>   percpu_write()
>   percpu_add()
>   percpu_sub()
>   percpu_or() 
>   percpu_xor()
> 
> and implements support for them on x86. (other architectures will fall 
> back to a default implementation)
> 
> The advantage is that for example to read a local percpu variable, instead 
> of this sequence:
> 
>  return __get_cpu_var(var);
> 
>  ffffffff8102ca2b:	48 8b 14 fd 80 09 74 	mov    -0x7e8bf680(,%rdi,8),%rdx
>  ffffffff8102ca32:	81 
>  ffffffff8102ca33:	48 c7 c0 d8 59 00 00 	mov    $0x59d8,%rax
>  ffffffff8102ca3a:	48 8b 04 10          	mov    (%rax,%rdx,1),%rax
> 
> We can get a single instruction by using the optimized variants:
> 
>  return percpu_read(var);
> 
>  ffffffff8102ca3f:	65 48 8b 05 91 8f fd 	mov    %gs:0x7efd8f91(%rip),%rax
> 
> I also cleaned up the x86-specific APIs and made the x86 code use these 
> new generic percpu primitives.
> 
> It looks quite hard to convince the compiler to generate the optimized 
> single-instruction sequence for us out of __get_cpu_var(var) - or can you 
> perhaps see a way to do it?

Yeah, I thought about that too but couldn't think of a way to persuade
the compiler because the compiler doesn't know how to access the
address.  I'll play with it a bit more, but the clumsy percpu_*()
accessors might be the only way.  :-(
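
For illustration only, here is a minimal user-space sketch of the two
access patterns discussed above.  It is not the kernel implementation
and not taken from the patch: NR_CPUS, struct percpu_area, percpu_base,
setup_percpu(), table_read_var(), tls_var, tls_read_var() and
tls_write_var() are all hypothetical names.  The table-based reader
mimics the three-step sequence (load the per-CPU base, form the
variable's offset, load the value).  The second reader uses C11
_Thread_local as a stand-in for segment-relative addressing; on x86-64
Linux with GCC or Clang such a read typically compiles to a single
%fs-relative mov, which is analogous to (but not the same as) the
%gs-relative access that percpu_read() produces.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Hypothetical per-CPU area layout; the field names are made up. */
struct percpu_area {
	long counter;
	long var;
};

/* One area per simulated CPU, plus a table of base addresses that
 * plays the role of the per-CPU offset table. */
static struct percpu_area areas[NR_CPUS];
static char *percpu_base[NR_CPUS];

static void setup_percpu(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		percpu_base[cpu] = (char *)&areas[cpu];
		areas[cpu].var = 100 + cpu;
	}
}

/* Table-based read: load the base for this CPU, add the variable's
 * offset, then load the value.  Three dependent steps. */
static long table_read_var(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	char *base = percpu_base[cpu];
	size_t off = offsetof(struct percpu_area, var);
	return *(long *)(base + off);
}

/* Segment-relative analogue: the compiler knows how to address
 * thread-local storage, so no explicit base lookup is written here. */
static _Thread_local long tls_var;

static void tls_write_var(long v)
{
	tls_var = v;
}

static long tls_read_var(void)
{
	return tls_var;
}
```

After setup_percpu(), table_read_var(cpu) returns the value stored for
the given simulated CPU, while tls_read_var() returns whatever the
calling thread last stored with tls_write_var().  The compiler can emit
the short form in the second case because thread-local addressing is
something it understands natively, which is exactly what is missing
for __get_cpu_var().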

Thanks.

-- 
tejun