Message-Id: <20090129.123955.102562289.ryov@valinux.co.jp>
Date: Thu, 29 Jan 2009 12:39:55 +0900 (JST)
From: Ryo Tsuruta <ryov@...inux.co.jp>
To: vgoyal@...hat.com
Cc: dm-devel@...hat.com, agk@...hat.com, linux-kernel@...r.kernel.org,
containers@...ts.linux-foundation.org, nauman@...gle.com,
dpshah@...gle.com, lizf@...fujitsu.com, mikew@...gle.com,
fchecconi@...il.com, paolo.valente@...more.it,
jens.axboe@...cle.com, fernando@...ellilink.co.jp,
s-uchida@...jp.nec.com, taka@...inux.co.jp,
guijianfeng@...fujitsu.com, arozansk@...hat.com, jmoyer@...hat.com,
riel@...hat.com, peterz@...radead.org, menage@...gle.com,
balbir@...ux.vnet.ibm.com, dhaval@...ux.vnet.ibm.com,
chrisw@...hat.com
Subject: Hierarchical grouping facility for IO controller (Re: [dm-devel]
[PATCH 1/2] dm-ioband: I/O bandwidth controller v1.10.0: Source code and
patch)
Hi Vivek,
This mail is about the hierarchical grouping facility for the IO controller.
> I think with the introduction of cgroup, IO scheduling has now become
> hierarchical scheduling. Previously it was flat scheduling where there
> was only one level. Now there can be multiple levels and each level
> can have groups and queues. I don't think that we can break down the
> hierarchical scheduling problem into two parts where the top level part
> is moved into a module. It is something like saying that let's break out
> cpu group scheduling into a separate module and it should not be part of
> the kernel.
>
> I think we need to implement this hierarchical IO scheduler in the kernel,
> which can schedule groups as well as end-level IO queues (maintained by
> cfq, deadline, as, or noop).
I can implement the hierarchical grouping facility if it is really
necessary, and the patch would be released after dm-ioband is merged
into the kernel. But do you believe the hierarchical grouping facility
should be implemented in the kernel, even if we can do it in userland?
I think disk bandwidth control doesn't need the kind of responsiveness
that the CPU scheduler does. I know it's possible to implement it in the
kernel, but wouldn't that only make the kernel more complex?
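As a rough illustration of what doing it in userland could mean: a
hierarchy of weighted groups can be flattened into per-leaf shares, which
a flat controller could then be configured with. The sketch below is only
that, a sketch. The group names, the weights, and the `flatten` helper are
hypothetical and are not part of dm-ioband or cgroup.

```python
from fractions import Fraction


def flatten(tree, share=Fraction(1)):
    """Flatten a nested weight tree into {leaf_name: share_of_total}.

    `tree` maps a group name to (weight, children), where `children` is a
    dict of the same shape and is empty for a leaf group. Leaf names are
    assumed to be unique across the whole tree.
    """
    # Weights are only meaningful relative to siblings at the same level.
    total = sum(weight for weight, _ in tree.values())
    flat = {}
    for name, (weight, children) in tree.items():
        group_share = share * Fraction(weight, total)
        if children:
            # An inner group passes its share down to its children.
            flat.update(flatten(children, group_share))
        else:
            flat[name] = group_share
    return flat


# Hypothetical two-level hierarchy: "web" is split 3:1 between two children.
hierarchy = {
    "web": (60, {"frontend": (3, {}), "backend": (1, {})}),
    "batch": (40, {}),
}
shares = flatten(hierarchy)
for name, value in sorted(shares.items()):
    print(name, value)
```

Note that such a static flattening does not by itself reproduce what a
hierarchical scheduler does when a group goes idle: the unused share would
be spread over all remaining leaves, not only over the idle group's
siblings, unless the tool recomputes the shares.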
> > > - Task grouping logic
> > > - We already have the notion of cgroup where tasks can be grouped
> > > in a hierarchical manner. dm-ioband does not make full use of that and
> > > comes up with its own mechanism of grouping tasks (apart from cgroup).
> > > And there are odd ways of specifying the cgroup id while configuring the
> > > dm-ioband device. I think once somebody has created the cgroup
> > > hierarchy, any IO controller logic should be able to internally
> > > read that hierarchy and provide control. There should not be a need
> > > for any other configuration utility on top of cgroup.
> > >
> > > My RFC patches had done that.
> >
> > Dm-ioband can work with the bio-cgroup mechanism, which of course
> > groups tasks in the same manner as cgroup.
> > I already have a basic design to make dm-ioband support the cgroup
> > hierarchy. This work should start after the core code of bio-cgroup,
> > which helps trace each I/O request, is merged into the -mm tree.
>
> bio-cgroup patches are fine because they provide us the capability to
> map delayed writes to the right cgroup. And it can be used by any IO
> controller.
>
> > And the reason dm-ioband uses cgroup id to specify a cgroup is that
> > the current cgroup infrastructure lacks features to manage resources
> > placed in the kernel modules.
>
> Can you elaborate on that please? We have heard in the past that cgroup
> does not give you enough flexibility but never got details.
The current cgroup framework can't manage resources dynamically. The
reason for using a cgroup ID to specify a cgroup is that it makes the
implementation quite simple. Although it is possible to implement the
function without using the ID, doing so would make the kernel more
complex, because the function would have to be implemented outside of
the kernel.
Thanks,
Ryo Tsuruta