[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <48D267B5.20402@gmail.com>
Date: Thu, 18 Sep 2008 16:37:41 +0200
From: Andrea Righi <righi.andrea@...il.com>
To: Vivek Goyal <vgoyal@...hat.com>
CC: Ryo Tsuruta <ryov@...inux.co.jp>, linux-kernel@...r.kernel.org,
dm-devel@...hat.com, containers@...ts.linux-foundation.org,
virtualization@...ts.linux-foundation.org,
xen-devel@...ts.xensource.com, fernando@....ntt.co.jp,
balbir@...ux.vnet.ibm.com, xemul@...nvz.org, agk@...rceware.org,
jens.axboe@...cle.com
Subject: Re: dm-ioband + bio-cgroup benchmarks
Vivek Goyal wrote:
> On Thu, Sep 18, 2008 at 09:04:18PM +0900, Ryo Tsuruta wrote:
>> Hi All,
>>
>> I have got excellent results of dm-ioband, that controls the disk I/O
>> bandwidth even when it accepts delayed write requests.
>>
>> In this time, I ran some benchmarks with a high-end storage. The
>> reason was to avoid a performance bottleneck due to mechanical factors
>> such as seek time.
>>
>> You can see the details of the benchmarks at:
>> http://people.valinux.co.jp/~ryov/dm-ioband/hps/
>>
>
> Hi Ryo,
>
> I had a query about dm-ioband patches. IIUC, dm-ioband patches will break
> the notion of process priority in CFQ because now dm-ioband device will
> hold the bio and issue these to lower layers later based on which bio's
> become ready. Hence actual bio submitting context might be different and
> because cfq derives the io_context from current task, it will be broken.
>
> To mitigate that problem, we probably need to implement Fernando's
> suggestion of putting io_context pointer in bio.
>
> Have you already done something to solve this issue?
>
> Secondly, why do we have to create an additional dm-ioband device for
> every device we want to control using rules. This looks little odd
> atleast to me. Can't we keep it in line with rest of the controllers
> where task grouping takes place using cgroup and rules are specified in
> cgroup itself (The way Andrea Righi does for io-throttling patches)?
>
> To avoid creation of stacking another device (dm-ioband) on top of every
> device we want to subject to rules, I was thinking of maintaining an
> rb-tree per request queue. Requests will first go into this rb-tree upon
> __make_request() and then will filter down to elevator associated with the
> queue (if there is one). This will provide us the control of releasing
> bio's to elevaor based on policies (proportional weight, max bandwidth
> etc) and no need of stacking additional block device.
>
> I am working on some experimental proof of concept patches. It will take
> some time though.
>
> I was thinking of following.
>
> - Adopt the Andrea Righi's style of specifying rules for devices and
> group the tasks using cgroups.
>
> - To begin with, adopt dm-ioband's approach of proportional bandwidth
> controller. It makes sense to me limit the bandwidth usage only in
> case of contention. If there is really a need to limit max bandwidth,
> then probably we can do something to implement additional rules or
> implement some policy switcher where user can decide what kind of
> policies need to be implemented.
>
> - Get rid of dm-ioband and instead buffer requests on an rb-tree on every
> request queue which is controlled by some kind of cgroup rules.
>
> It would be good to discuss above approach now whether it makes sense or
> not. I think it is kind of fusion of io-throttling and dm-ioband patches
> with additional idea of doing io-control just above elevator on the request
> queue using an rb-tree.
Thanks Vivek. All sounds reasonable to me and I think this is be the right way
to proceed.
I'll try to design and implement your rb-tree per request-queue idea into my
io-throttle controller, maybe we can reuse it also for a more generic solution.
Feel free to send me your experimental proof of concept if you want, even if
it's not yet complete, I can review it, test and contribute.
-Andrea
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists