Message-Id: <1173226350.4644.54.camel@localhost.localdomain>
Date:	Wed, 07 Mar 2007 11:12:30 +1100
From:	Rusty Russell <rusty@...tcorp.com.au>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	lkml - Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Zachary Amsden <zach@...are.com>,
	Jeremy Fitzhardinge <jeremy@...source.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Andi Kleen <ak@...e.de>
Subject: Re: [PATCH 8/8] Convert PDA into the percpu section

On Tue, 2007-03-06 at 14:10 +0100, Ingo Molnar wrote:
> * Rusty Russell <rusty@...tcorp.com.au> wrote:
> 
> > Currently x86 (similar to x86-64) has a special per-cpu structure 
> > called "i386_pda" which can be easily and efficiently referenced via 
> > the %fs register.  An ELF section is more flexible than a structure, 
> > allowing any piece of code to use this area.  Indeed, such a section 
> > already exists: the per-cpu area.
> > 
> > So this patch
> > (1) Removes the PDA and uses per-cpu variables for each current member.
> 
> hmm ... i very much like this, but it needs performance and kernel-size 
> testing before it can move from -mm into mainline. We are now exposing 
> wide ranges of the kernel to segment prefixes again. (Btw., i'd expect 
> there to be a kernel size reduction.)

Hi Ingo,

	Thanks!  There are some interesting issues.  Because __get_cpu_var()
returns an lvalue, we can't use %fs:value directly; instead we calculate
the address (%fs:this_cpu_off + &value) and dereference it.  So
previously there was only a tiny code reduction.
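
To illustrate the difference, here is a rough userspace sketch of the
two access paths.  It is only a simulation: there is no segment register
involved, and every name ending in _sim is made up for this example
rather than taken from the patch.

```c
#include <stddef.h>

#define NR_CPUS_SIM 4	/* hypothetical CPU count for the simulation */

/* One copy of each "per-cpu variable" per simulated CPU. */
struct percpu_area_sim {
	int  preempt_count;
	long irq_count;
};

static struct percpu_area_sim areas_sim[NR_CPUS_SIM];

/* Stand-in for %fs:this_cpu_off: byte offset of the current CPU's area. */
static size_t this_cpu_off_sim;

static void set_cpu_sim(int cpu)
{
	this_cpu_off_sim = (size_t)cpu * sizeof(struct percpu_area_sim);
}

/*
 * lvalue path: the caller may write through the result, so a real
 * address has to be computed (base + this_cpu_off + variable offset).
 */
static int *preempt_count_ptr_sim(void)
{
	char *base = (char *)areas_sim;

	return (int *)(base + this_cpu_off_sim +
		       offsetof(struct percpu_area_sim, preempt_count));
}

/*
 * rvalue path: only the value is needed.  On real x86 this could be a
 * single %fs-relative load; the simulation has no segment register and
 * simply goes through the pointer.
 */
static int preempt_count_read_sim(void)
{
	return *preempt_count_ptr_sim();
}
```

In the simulation both paths cost the same; the point is only that the
first one has to produce an address, while the second one does not.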

	If we used __thread, then gcc could do this optimization for us when it
knows an rvalue is needed, however:

1) gcc wants to use %gs, not %fs, which is measurably slower for the
kernel,
2) gcc wants to use huge offsets to store the address of the per-cpu
space, and this breaks Xen (and current lguest, but new lguest no longer
uses segments for protection)
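
For comparison, a minimal userspace sketch of the __thread approach.
The names are hypothetical.  When built for 32-bit x86 Linux userspace,
gcc addresses such a variable relative to %gs, which is the register
choice point 1 refers to; the compiler can use a direct segment-relative
load when only the value is needed.

```c
/* __thread is a GCC/Clang extension; C11 spells it _Thread_local. */
static __thread int value_sim;

/* rvalue use: the compiler is free to emit a segment-relative load. */
static int read_value_sim(void)
{
	return value_sim;
}

/* lvalue use: the compiler must materialize the full address. */
static int *addr_value_sim(void)
{
	return &value_sim;
}
```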

One solution would be to expose x86_read_percpu() as read_percpu() and
implement it in asm-generic/percpu.h as well, then use it in places
where only an rvalue is required.
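
A rough sketch of how such a generic fallback might look, again as a
userspace simulation.  The _sim macros and current_cpu_sim are
assumptions made for this example; only the read_percpu() name comes
from the proposal above, and the real asm-generic/percpu.h would differ.

```c
#define NR_CPUS_SIM 2	/* hypothetical CPU count for the simulation */

static int current_cpu_sim;	/* stand-in for the running CPU's id */

/* Simulated per-cpu storage: one array slot per CPU. */
#define DEFINE_PER_CPU_SIM(type, name) \
	static type per_cpu_##name[NR_CPUS_SIM]

/* lvalue accessor, playing the role of __get_cpu_var(). */
#define get_cpu_var_sim(name) (per_cpu_##name[current_cpu_sim])

/*
 * Generic fallback: an architecture header could define read_percpu()
 * first (for example as one segment-relative load); everyone else gets
 * a plain read.  The comma expression makes the result an rvalue, so it
 * cannot be assigned to or have its address taken.
 */
#ifndef read_percpu
#define read_percpu(name) ((void)0, get_cpu_var_sim(name))
#endif

DEFINE_PER_CPU_SIM(int, cpu_number);
```

Callers that only need the value would then use
read_percpu(cpu_number), and the lvalue accessor would be kept for the
places that write to the variable or take its address.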

Cheers!
Rusty.

