lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 12 Jan 2009 09:44:58 -0800
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Christoph Lameter <cl@...ux-foundation.org>
Cc:	Rusty Russell <rusty@...tcorp.com.au>, Tejun Heo <tj@...nel.org>,
	Ingo Molnar <mingo@...e.hu>, travis@....com,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	"H. Peter Anvin" <hpa@...or.com>,
	Andrew Morton <akpm@...ux-foundation.org>, steiner@....com,
	Hugh Dickins <hugh@...itas.com>
Subject: Re: regarding the x86_64 zero-based percpu patches

Christoph Lameter <cl@...ux-foundation.org> writes:

> On Sat, 10 Jan 2009, Rusty Russell wrote:
>
>> > As I was trying to do more stuff per-cpu
>> > (not putting a lot of stuff into per-cpu area but even with small
>> > things limited per-cpu area poses scalability problems), cpu_alloc
>> > seems to fit the bill better.
>>
>> Unfortunately cpu_alloc didn't solve this problem either.
>>
>> We need to grow the areas, but for NUMA layouts it's non-trivial.  I don't
>> like the idea of remapping: one TLB entry per page per cpu is going to suck.
>> Finding pages which are "congruent" with the original percpu pages is more
>> promising, but it will almost certainly need to elbow pages out the way to
>> have a chance of succeeding on a real system.
>
> An allocation automatically falls back to the nearest node on NUMA
> cpu_to_node() gives you the current node.
>
> There are 2M TLB entries on x86_64. If we really get into a high usage
> scenario then the 2M entry makes sense. Average server memory sizes likely
> already are way beyond 10G per box. The higher that goes the more
> reasonable the 2M TLB entry will be.

2M of per cpu data doesn't make sense, and likely indicates a design
flaw somewhere.  It just doesn't make sense to have large amounts of
data allocated per cpu.

The most common user of per cpu data I am aware of is allocating one
word per cpu for counters.

What would be better is simply to: 
- Require a lock to access another cpus per cpu data.
- Do large page allocations for the per cpu data.

At which point we could grow the per cpu data by simply reallocating it on
each cpu and updating the register that holds the base pointer.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ