Message-ID: <48D26BA3.40009@gmail.com>
Date: Thu, 18 Sep 2008 16:54:27 +0200
From: Andrea Righi <righi.andrea@...il.com>
To: Vivek Goyal <vgoyal@...hat.com>
CC: Hirokazu Takahashi <taka@...inux.co.jp>, randy.dunlap@...cle.com,
menage@...gle.com, chlunde@...g.uio.no, dpshah@...gle.com,
eric.rannaud@...il.com, balbir@...ux.vnet.ibm.com,
fernando@....ntt.co.jp, akpm@...ux-foundation.org,
agk@...rceware.org, subrata@...ux.vnet.ibm.com, axboe@...nel.dk,
m.innocenti@...eca.it, containers@...ts.linux-foundation.org,
linux-kernel@...r.kernel.org, dave@...ux.vnet.ibm.com,
matt@...ehost.com, roberto@...it.it, ngupta@...gle.com
Subject: Re: [RFC][PATCH -mm 0/5] cgroup: block device i/o controller (v9)

Vivek Goyal wrote:
> On Wed, Sep 17, 2008 at 10:47:54AM +0200, Andrea Righi wrote:
>> Hirokazu Takahashi wrote:
>>> Hi,
>>>
>>>> TODO:
>>>>
>>>> * Try to push down the throttling and implement it directly in the I/O
>>>> schedulers, using bio-cgroup (http://people.valinux.co.jp/~ryov/bio-cgroup/)
>>>> to keep track of the right cgroup context. This approach could lead to more
>>>> memory consumption and increases the number of dirty pages (hard/slow to
>>>> reclaim pages) in the system, since dirty-page ratio in memory is not
>>>> limited. This could even lead to potential OOM conditions, but these problems
>>>> can be resolved directly into the memory cgroup subsystem
>>>>
>>>> * Handle I/O generated by kswapd: at the moment there's no control on the I/O
>>>> generated by kswapd; try to use the page_cgroup functionality of the memory
>>>> cgroup controller to track this kind of I/O and charge the right cgroup when
>>>> pages are swapped in/out
>>> FYI, this can also be done with bio-cgroup, which determines the owner
>>> cgroup of a given anonymous page.
>>>
>>> Thanks,
>>> Hirokazu Takahashi
>> That would be great! FYI here is how I would like to proceed:
>>
>> - today I'll post a new version of my cgroup-io-throttle patch rebased
>> to 2.6.27-rc5-mm1 (it's well tested and seems to be stable enough).
>> To keep things light and simple I've implemented custom
>> get_cgroup_from_page() / put_cgroup_from_page() in the memory
>> controller to retrieve the owner of a page, holding a reference to the
>> corresponding memcg, during async writes in submit_bio(); this is
>> probably not the best way to proceed, and a more generic framework like
>> bio-cgroup sounds better, but it seems to work quite well. The only
>> problem I've found is that during swap_writepage() the page is not
>> assigned to any page_cgroup (page_get_page_cgroup() returns NULL), and
>> so I'm not able to charge the cost of this I/O operation to the right
>> cgroup. Does bio-cgroup address or even resolve this issue?
>> - begin to implement a new branch of cgroup-io-throttle on top of
>> bio-cgroup
>> - also start to implement an additional request queue to provide first a
>> control at the cgroup level and a dispatcher to pass the request to
>> the elevator (as suggested by Vivek)
>>
>
> Hi Andrea,
>
> So if we maintain an rb-tree per request queue and implement the cgroup
> rules there, then that will take care of io-throttling also. (One can
> control the release of bio/requests to elevator based on any kind of
> rules. proportional weight/max-bandwidth).
>
> If that's the case, I was wondering what you mean by "begin to
> implement a new branch of cgroup-io-throttle on top of bio-cgroup".
Correct, with the rb-tree per request queue solution there's no need to
keep track of the cgroup context in the struct bio, since the i/o control
based on per-cgroup rules has already been performed by the first i/o
dispatcher. I would really like to dedicate all my efforts to moving
in this direction, but it would also be interesting to test the
bio-cgroup functionality, since it already works today, it's a generic
framework and it's used by another project (dm-ioband). This is why I
listed it as a separate branch: it would be an alternative solution to
the point that follows it (the additional request queue and dispatcher).
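
To make the "first i/o dispatcher" idea a bit more concrete, here is a
minimal user-space sketch of the proportional-weight case. It is only an
illustration under stated assumptions, not code from any posted patch:
all names (io_group, io_dispatcher, iod_*) are hypothetical, a small
array scanned linearly stands in for the per-request-queue rb-tree, and
a bio is reduced to its size in bytes. Each cgroup carries a virtual
time; the dispatcher always releases a bio from the backlogged group
with the smallest virtual time and then advances that time by
bytes / weight, so a group with twice the weight is served about twice
as often.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GROUPS  8
#define MAX_PENDING 16

/* Hypothetical per-cgroup state, kept per request queue. */
struct io_group {
	unsigned int weight;            /* proportional share, must be > 0 */
	uint64_t vtime;                 /* virtual service time (the rb-tree key) */
	uint32_t pending[MAX_PENDING];  /* sizes in bytes of queued bios, FIFO ring */
	size_t head, count;
};

/* Hypothetical first-stage dispatcher sitting in front of the elevator. */
struct io_dispatcher {
	struct io_group groups[MAX_GROUPS];
	size_t nr_groups;
};

/* Register a cgroup with the given weight; returns its index. */
static int iod_add_group(struct io_dispatcher *d, unsigned int weight)
{
	assert(weight > 0 && d->nr_groups < MAX_GROUPS);
	d->groups[d->nr_groups] = (struct io_group){ .weight = weight };
	return (int)d->nr_groups++;
}

/* Hold a bio of 'bytes' bytes for group 'gid'; returns -1 if its queue is full. */
static int iod_queue_bio(struct io_dispatcher *d, int gid, uint32_t bytes)
{
	struct io_group *g = &d->groups[gid];

	if (g->count == MAX_PENDING)
		return -1;
	g->pending[(g->head + g->count) % MAX_PENDING] = bytes;
	g->count++;
	return 0;
}

/*
 * Release one bio towards the elevator. Picks the backlogged group with
 * the smallest vtime (a linear scan here; an rb-tree would give the
 * leftmost node). Returns the index of the group served, or -1 if
 * nothing is pending.
 */
static int iod_dispatch_one(struct io_dispatcher *d)
{
	int best = -1;

	for (size_t i = 0; i < d->nr_groups; i++) {
		struct io_group *g = &d->groups[i];

		if (g->count == 0)
			continue;
		if (best < 0 || g->vtime < d->groups[best].vtime)
			best = (int)i;
	}
	if (best < 0)
		return -1;

	struct io_group *g = &d->groups[best];
	uint32_t bytes = g->pending[g->head];

	g->head = (g->head + 1) % MAX_PENDING;
	g->count--;
	g->vtime += (uint64_t)bytes / g->weight;  /* charge the cost, scaled by weight */
	return best;
}
```

With two groups of weight 2 and 1 that both keep 4096-byte bios queued,
nine calls to iod_dispatch_one() serve the first group six times and the
second three times. A max-bandwidth rule could be placed at the same
point by refusing to pick a group that has exhausted its budget for the
current time slice. The sketch ignores what happens when an idle group
becomes backlogged again with a stale vtime, which a real implementation
would have to handle.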
-Andrea