Message-Id: <20080912131816.e0cfac7a.akpm@linux-foundation.org>
Date: Fri, 12 Sep 2008 13:18:16 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Andrea Righi <righi.andrea@...il.com>
Cc: balbir@...ux.vnet.ibm.com, menage@...gle.com,
kamezawa.hiroyu@...fujitsu.com, dave@...ux.vnet.ibm.com,
chlunde@...g.uio.no, dpshah@...gle.com, eric.rannaud@...il.com,
fernando@....ntt.co.jp, agk@...rceware.org, m.innocenti@...eca.it,
s-uchida@...jp.nec.com, ryov@...inux.co.jp, matt@...ehost.com,
dradford@...ehost.com, containers@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org, Michael Rubin <mrubin@...gle.com>
Subject: Re: [RFC] [PATCH -mm 0/2] memcg: per cgroup dirty_ratio
On Fri, 12 Sep 2008 17:09:50 +0200
Andrea Righi <righi.andrea@...il.com> wrote:
>
> The goal of the patch is to control how many dirty file pages a cgroup can have
> at any given time (see also [1]).
>
> Dirty file and writeback pages are accounted for each cgroup using the memory
> controller statistics. Moreover, the dirty_ratio parameter is added to the
> memory controller. It contains, as a percentage of the cgroup memory, the
> number of dirty pages at which the processes belonging to the cgroup which are
> generating disk writes will start writing out dirty data.
>
> So, the behaviour is actually the same as the global dirty_ratio, except that
> it works per cgroup.
>
> Interface:
> - two new entries "writeback" and "filedirty" are added to the file
> memory.stat, to export to userspace respectively the number of pages under
> writeback and the number of dirty file pages in the cgroup
>
> - the new file memory.dirty_ratio is added in the cgroup filesystem to show/set
> the memcg dirty_ratio

Seems like a desirable objective.

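
As a rough illustration of the interface described in the quoted text, the sketch below parses "name value" lines of the kind memory.stat exposes and checks the proposed "filedirty" counter against a percentage threshold. The sample values, the helper names, the strict ">" comparison, and the assumption that the threshold is taken against a cgroup memory size expressed in pages are all illustrative; none of this is taken from the patch itself.

```python
def parse_memory_stat(text):
    """Parse 'name value' lines, as found in a cgroup memory.stat file."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def over_dirty_threshold(stats, cgroup_pages, dirty_ratio):
    """Return True when dirty file pages exceed dirty_ratio percent of
    the cgroup's memory (both counted in pages; an assumption here)."""
    threshold_pages = cgroup_pages * dirty_ratio // 100
    return stats.get("filedirty", 0) > threshold_pages


# Made-up sample contents; a real memory.stat has many more entries.
sample = "filedirty 1200\nwriteback 300\n"
stats = parse_memory_stat(sample)

# 10% of a 10000-page cgroup is 1000 pages, so 1200 dirty pages is over.
print(over_dirty_threshold(stats, cgroup_pages=10000, dirty_ratio=10))
```

On a kernel carrying the patch, the text would presumably be read from the cgroup's memory.stat file and the ratio from memory.dirty_ratio, rather than from hard-coded values.
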
> [ This patch is still experimental and I only did a few quick tests. I'd like to
> run more detailed benchmarks and compare the results; I guess the overhead
> introduced by this patch shouldn't be so small... and BTW I would prefer a
> dirty limit in bytes, instead of using a percentage of memory. Bytes are hugely
> more flexible IMHO; they allow finer-grained limits to be defined, and so this
> would work better on large memory machines. ]
>
> [1] http://lkml.org/lkml/2008/9/9/245

I tend to duck experimental and rfc patches ;)

One thing to think about please: Michael Rubin is hitting problems with
the existing /proc/sys/vm/dirty_ratio. Its present granularity of 1%
is just too coarse for really large machines, and as
memory-size/disk-speed ratios continue to increase, this will just get
worse.

So after thinking about it a bit I encouraged him to propose a patch
which adds a new /proc/sys/vm/hires-dirty-ratio (for some value of
"hires" ;)) which simply offers a higher-resolution interface to the
same internal kernel machinery.

How does this affect you? I don't think we should be adding new
interfaces which have the old 1%-resolution problem. Once we get this
higher-resolution interface sorted out, your new interface should do it
the same way.
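
To put numbers on the granularity concern, here is a small arithmetic sketch. It compares the smallest possible change in the dirty limit at 1% resolution with a finer resolution. The 0.001% figure is only a stand-in for "some value of hires", and using total memory as the base is a simplification; both are assumptions made for illustration.

```python
GIB = 1024 ** 3


def step_bytes(total_bytes, steps):
    """Smallest change in the dirty limit when the ratio can take
    `steps` distinct settings across the whole of `total_bytes`."""
    return total_bytes // steps


for mem_gib in (4, 64, 1024):
    total = mem_gib * GIB
    coarse = step_bytes(total, 100)      # 1% resolution
    fine = step_bytes(total, 100_000)    # hypothetical 0.001% resolution
    print(f"{mem_gib:5d} GiB: 1% step = {coarse / GIB:.2f} GiB, "
          f"0.001% step = {fine / 1024 ** 2:.2f} MiB")
```

The point of the comparison is that the 1% step grows linearly with memory size, so on a machine with a terabyte of memory the smallest adjustment is already on the order of ten gibibytes of dirty data.
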