Message-ID: <Pine.LNX.4.64.0901121120280.30369@quilx.com>
Date: Mon, 12 Jan 2009 11:23:27 -0600 (CST)
From: Christoph Lameter <cl@...ux-foundation.org>
To: Rusty Russell <rusty@...tcorp.com.au>
cc: Tejun Heo <tj@...nel.org>, Ingo Molnar <mingo@...e.hu>,
travis@....com,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Eric Biederman <ebiederm@...ssion.com>, steiner@....com,
Hugh Dickins <hugh@...itas.com>
Subject: Re: regarding the x86_64 zero-based percpu patches
On Sat, 10 Jan 2009, Rusty Russell wrote:
> > As I was trying to do more stuff per-cpu
> > (not putting a lot of stuff into the per-cpu area, but even with small
> > things the limited per-cpu area poses scalability problems), cpu_alloc
> > seems to fit the bill better.
>
> Unfortunately cpu_alloc didn't solve this problem either.
>
> We need to grow the areas, but for NUMA layouts it's non-trivial. I don't
> like the idea of remapping: one TLB entry per page per cpu is going to suck.
> Finding pages which are "congruent" with the original percpu pages is more
> promising, but it will almost certainly need to elbow pages out of the way to
> have a chance of succeeding on a real system.
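
To illustrate the cost Rusty is describing, here is a minimal userspace
sketch with made-up names (not the kernel's actual percpu machinery):
each cpu reaches its copy of a percpu object by adding a per-cpu offset
to a canonical address, so scattered 4K backing pages mean one mapping,
and one TLB entry, per page per cpu. Congruent pages could instead sit
under a single large mapping.

#include <stdint.h>
#include <stdio.h>

#define NR_CPUS    4
#define AREA_SIZE  4096        /* one page per cpu for the example */

/* Stand-in for the per-cpu areas; in the kernel these would be the
 * remapped pages under discussion. */
static char area[NR_CPUS][AREA_SIZE];

/* Analogous to x86_64's __per_cpu_offset[]: distance from the
 * canonical (cpu 0) address to cpu N's copy. */
static intptr_t pcpu_offset[NR_CPUS];

static void *per_cpu_ptr_sketch(void *canonical, int cpu)
{
        return (char *)canonical + pcpu_offset[cpu];
}

int main(void)
{
        int cpu;

        for (cpu = 0; cpu < NR_CPUS; cpu++)
                pcpu_offset[cpu] = (intptr_t)area[cpu] - (intptr_t)area[0];

        /* Each cpu's copy lives in a different page: if those pages
         * are not congruent, every one needs its own TLB entry. */
        for (cpu = 0; cpu < NR_CPUS; cpu++)
                printf("cpu%d copy of area[0] at %p\n",
                       cpu, per_cpu_ptr_sketch(area[0], cpu));
        return 0;
}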
An allocation automatically falls back to the nearest node on NUMA, and
cpu_to_node() gives you the current cpu's node.
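
To make that concrete, a minimal sketch (the function name and the
order-0 size are mine for illustration, not from any patch) using the
allocator interfaces that give this behavior:

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/topology.h>

/* Illustrative only: grab one page for a cpu's chunk from that cpu's
 * home node.  If the node is out of memory, the page allocator falls
 * back to the nearest node automatically. */
static void *alloc_percpu_chunk_sketch(int cpu)
{
        struct page *page;

        page = alloc_pages_node(cpu_to_node(cpu), GFP_KERNEL, 0);
        if (!page)
                return NULL;
        return page_address(page);
}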
x86_64 has 2M (huge page) TLB entries. If we really get into a
high-usage scenario then a 2M entry makes sense. Average server memory
sizes are likely already well beyond 10G per box; the higher that goes,
the more reasonable the 2M TLB entry becomes.
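
Back of the envelope, assuming 64 cpus (a number picked purely for
illustration) against the 10G figure above:

        64 cpus * one 2M huge page each  = 128M of memory
        128M / 10G                      ~= 1.25% of the box
        2M / 4K                          = 512 4K TLB entries replaced per cpu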
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/