[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20061101042341.83cbd77e.pj@sgi.com>
Date: Wed, 1 Nov 2006 04:23:41 -0800
From: Paul Jackson <pj@....com>
To: balbir@...ibm.com
Cc: menage@...gle.com, dev@...nvz.org, vatsa@...ibm.com,
sekharan@...ibm.com, ckrm-tech@...ts.sourceforge.net,
haveblue@...ibm.com, linux-kernel@...r.kernel.org,
matthltc@...ibm.com, dipankar@...ibm.com, rohitseth@...gle.com
Subject: Re: [ckrm-tech] RFC: Memory Controller
Balbir wrote:
> Paul Menage wrote:
> > On 10/31/06, Balbir Singh <balbir@...ibm.com> wrote:
> >> I thought this would be hard to do in general, but with a page -->
> >> container mapping that will come as a result of the memory controller,
> >> will it still be that hard?
> >
> > I meant that it's pretty much impossible with the current APIs
> > provided by the kernel. That's why one of the most useful things that
> > a memory controller can provide is accounting and limiting of page
> > cache usage.
> >
> > Paul
>
> Thanks for clarifying that! I completely agree, page cache control is
> very important!
Doesn't "zone_reclaim" (added by Christoph Lameter over the last
several months) go a long way toward resolving this page cache control
problem?
Essentially, if my understanding is correct, zone reclaim has tasks
that are asking for memory first do some work towards keeping enough
memory free, such as doing some work reclaiming slab memory and pushing
swap and pushing dirty buffers to disk.
Tasks must help out as is needed to keep the per-node free memory above
watermarks.
This way, you don't actually have to account for who owns what, with
all the problems arbitrating between claims on shared resources.
Rather, you just charge the next customer who comes in the front
door (aka, mm/page_alloc.c:__alloc_pages()) a modest overhead if
they happen to show up when free memory supplies are running short.
On average, it has the same affect as a strict accounting system,
of charging the heavy users more (more CPU cycles in kernel vmscan
code and clock cycles waiting on disk heads). But it does so without
any need of accurate per-user accounting.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@....com> 1.925.600.0401
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists