Message-ID: <487639ED.7000502@zytor.com>
Date:	Thu, 10 Jul 2008 12:33:49 -0400
From:	"H. Peter Anvin" <hpa@...or.com>
To:	Christoph Lameter <cl@...ux-foundation.org>
CC:	Jeremy Fitzhardinge <jeremy@...p.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Ingo Molnar <mingo@...e.hu>, Mike Travis <travis@....com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Jack Steiner <steiner@....com>, linux-kernel@...r.kernel.org,
	Arjan van de Ven <arjan@...radead.org>
Subject: Re: [RFC 00/15] x86_64: Optimize percpu accesses

Christoph Lameter wrote:
> H. Peter Anvin wrote:
> 
>> but there is a distinct lack of wiggle room, which can be resolved
>> either by using negative offsets, or by moving the kernel text area up a
>> bit from -2 GB.
> 
> Let's say we reserve 256MB of cpu alloc space per processor.
> 
> On a system with 4k processors this will result in the need for 1TB virtual address space for per cpu areas (note that there may be more processors in the future). Preferably we would calculate the address of the per cpu area by
> 
> 	PERCPU_START_ADDRESS + PERCPU_SIZE * smp_processor_id()
> 
> instead of looking it up in a table because that will save a memory access on per_cpu().

It will, but it might still be a net loss due to higher load on the TLB 
(you're effectively using the TLB to do the table lookup for you.)  On 
the other hand, Mike points out that once we move away from fixed-sized 
segments we pretty much have to use virtual addresses anyway(*).
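
To make the trade-off concrete, here is a minimal sketch of the two ways to find a CPU's per-cpu base: the computed form Christoph proposes and the table lookup it would replace. The 256 MB size and the 4096-CPU count come from the quoted text; the base address value, the table, and all function names are assumptions made up for illustration, not existing kernel symbols.

```c
#include <stdint.h>

#define PERCPU_SIZE  (256ULL << 20)   /* 256 MB per CPU, as in the quoted proposal */
#define NR_CPUS      4096ULL          /* "4k processors" from the quoted text */
/* Hypothetical base address, chosen only so the arithmetic does not overflow. */
#define PERCPU_START_ADDRESS 0xffffe00000000000ULL

/* Computed form: no memory access, but each CPU's area lives at a distinct
 * virtual address, so the translation cost is paid in the TLB instead. */
static uint64_t percpu_base_computed(uint64_t cpu)
{
    return PERCPU_START_ADDRESS + PERCPU_SIZE * cpu;
}

/* Table form: one extra load per lookup (hypothetical table). */
static uint64_t percpu_base_table[NR_CPUS];

static void percpu_table_init(void)
{
    for (uint64_t cpu = 0; cpu < NR_CPUS; cpu++)
        percpu_base_table[cpu] = percpu_base_computed(cpu);
}

static uint64_t percpu_base_lookup(uint64_t cpu)
{
    return percpu_base_table[cpu];
}

/* Total virtual address span reserved for all per-cpu areas. */
static uint64_t percpu_total_span(void)
{
    return PERCPU_SIZE * NR_CPUS;     /* 2^28 * 2^12 = 2^40 bytes = 1 TB */
}
```

The span works out to the 1TB figure quoted above. Which form wins in practice depends on TLB pressure versus the cost of the extra load, which this sketch does not measure.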

> The first percpu area would ideally be the per cpu segment generated by the linker.
> 
> How would that fit into the address map? In particular the 2G distance between code and the first per cpu area must not be violated unless we go to a zero based approach.

If by "zero-based" you mean "nonzero gs_base for the boot CPU" then 
yes, you're right.

Note again that that is completely orthogonal to RIP-based versus absolute.
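
As a rough illustration of the 2 GB limit being discussed, the sketch below checks whether a link-time symbol address is reachable through a signed 32-bit displacement, either as an absolute operand or relative to the instruction pointer. The helper names and the example addresses are assumptions for illustration only; 0xffffffff80000000 is simply -2 GB written as a 64-bit value.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86-64 memory operands carry a signed 32-bit displacement that is
 * sign-extended to 64 bits. */
static bool fits_disp32(int64_t value)
{
    return value >= INT32_MIN && value <= INT32_MAX;
}

/* Absolute form: the symbol address itself must fit in the displacement.
 * (The unsigned-to-signed cast assumes two's complement wraparound, which
 * holds on every mainstream x86-64 compiler.) */
static bool reachable_absolute(uint64_t sym)
{
    return fits_disp32((int64_t)sym);
}

/* RIP-relative form: the distance from the end of the instruction to the
 * symbol must fit in the displacement. */
static bool reachable_rip(uint64_t next_insn, uint64_t sym)
{
    return fits_disp32((int64_t)(sym - next_insn));
}
```

Under these assumptions, a small positive offset such as 0x1000 is reachable in absolute form but lies just over 2 GB away from code at 0xffffffff80000000; it becomes RIP-reachable if the code sits slightly higher (say 0xffffffff81000000) or if the offset is negative, which matches the "wiggle room" remark quoted earlier in the thread.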

	-hpa

