Message-ID: <48CAF583.8060406@gmail.com>
Date: Sat, 13 Sep 2008 01:04:35 +0200
From: Andrea Righi <righi.andrea@...il.com>
To: Andrew Morton <akpm@...ux-foundation.org>
CC: balbir@...ux.vnet.ibm.com, menage@...gle.com,
kamezawa.hiroyu@...fujitsu.com, dave@...ux.vnet.ibm.com,
chlunde@...g.uio.no, dpshah@...gle.com, eric.rannaud@...il.com,
fernando@....ntt.co.jp, agk@...rceware.org, m.innocenti@...eca.it,
s-uchida@...jp.nec.com, ryov@...inux.co.jp, matt@...ehost.com,
dradford@...ehost.com, containers@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org, Michael Rubin <mrubin@...gle.com>
Subject: Re: [RFC] [PATCH -mm 0/2] memcg: per cgroup dirty_ratio
Andrew Morton wrote:
> On Fri, 12 Sep 2008 17:09:50 +0200
> Andrea Righi <righi.andrea@...il.com> wrote:
>
>> The goal of the patch is to control how many dirty file pages a cgroup can have
>> at any given time (see also [1]).
>>
>> Dirty file and writeback pages are accounted for each cgroup using the memory
>> controller statistics. Moreover, a dirty_ratio parameter is added to the
>> memory controller. It specifies, as a percentage of the cgroup's memory, the
>> number of dirty pages at which the processes in the cgroup that are generating
>> disk writes will start writing out dirty data.
>>
>> So, the behaviour is actually the same as the global dirty_ratio, except that
>> it works per cgroup.
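To make the semantics concrete, the threshold would be computed along these
lines (a minimal sketch only; the function and variable names here are
illustrative, not taken from the patch):

unsigned long memcg_dirty_threshold(unsigned long memcg_pages,
                                    unsigned int dirty_ratio)
{
        /* dirty_ratio percent of the cgroup's pages, mirroring the
         * arithmetic of the global get_dirty_limits() */
        return memcg_pages * dirty_ratio / 100;
}

A task in the cgroup would start writing out dirty data once the cgroup's
dirty + writeback page count crosses this value, analogous to what
balance_dirty_pages() does against the global limits.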
>>
>> Interface:
>> - two new entries, "writeback" and "filedirty", are added to the file
>> memory.stat, exporting to userspace the number of pages under writeback
>> and the number of dirty file pages in the cgroup, respectively
>>
>> - the new file memory.dirty_ratio is added to the cgroup filesystem to
>> show/set the memcg dirty_ratio (a usage sketch follows this list)
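A minimal userspace sketch of driving the new files (the /cgroups mount point
and the "foo" group are assumptions for illustration, not part of the patch):

#include <stdio.h>

int main(void)
{
        char line[256];
        FILE *f;

        /* set the cgroup's dirty ratio to 5% of its memory */
        f = fopen("/cgroups/foo/memory.dirty_ratio", "w");
        if (!f)
                return 1;
        fprintf(f, "5\n");
        fclose(f);

        /* dump memory.stat, which now includes the "writeback"
         * and "filedirty" counters */
        f = fopen("/cgroups/foo/memory.stat", "r");
        if (!f)
                return 1;
        while (fgets(line, sizeof(line), f))
                fputs(line, stdout);
        fclose(f);
        return 0;
}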
>
> Seems like a desirable objective.
>
>> [ This patch is still experimental and I only did a few quick tests. I'd like
>> to run more detailed benchmarks and compare the results; I suspect the
>> overhead introduced by this patch may not be so small... and BTW I would
>> prefer a dirty limit in bytes, instead of a percentage of memory. Bytes are
>> hugely more flexible IMHO: they allow defining more fine-grained limits, so
>> this would work better on large memory machines. ]
>>
>> [1] http://lkml.org/lkml/2008/9/9/245
>
> I tend to duck experimental and rfc patches ;)
>
> One thing to think about please: Michael Rubin is hitting problems with
> the existing /proc/sys/vm/dirty_ratio. Its present granularity of 1%
> is just too coarse for really large machines, and as
> memory-size/disk-speed ratios continue to increase, this will just get
> worse.
>
> So after thinking about it a bit I encouraged him to propose a patch
> which adds a new /proc/sys/vm/hires-dirty-ratio (for some value of
> "hires" ;)) which simply offers a higher-resolution interface to the
> same internal kernel machinery.
>
> How does this affect you? I don't think we should be adding new
> interfaces which have the old 1%-resolution problem. Once we get this
> higher-resolution interface sorted out, your new interface should do it
> the same way.
Totally agree.
The hires-dirty-ratio interface seems much better. I'll follow the progress of
this new interface; reusing the same approach in my patch doesn't look too
difficult, in any case.
BTW why not use a simple dirty-ratio-in-bytes?
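To illustrate the granularity problem with a percentage (the machine size here
is just an example):

#include <stdio.h>

int main(void)
{
        unsigned long long mem = 1ULL << 40;    /* a 1 TiB machine */

        /* smallest nonzero setting with 1% granularity */
        printf("1%% of 1 TiB = %llu MiB of dirty memory\n",
               (mem / 100) >> 20);
        return 0;
}

The smallest nonzero 1% step already allows about 10 GiB of dirty memory, and
the only smaller setting is 0; a limit expressed in bytes has no such jump.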
Thanks for commenting,
-Andrea