Message-Id: <20090129.123644.28802208.ryov@valinux.co.jp>
Date: Thu, 29 Jan 2009 12:36:44 +0900 (JST)
From: Ryo Tsuruta <ryov@...inux.co.jp>
To: vgoyal@...hat.com
Cc: dm-devel@...hat.com, agk@...hat.com, linux-kernel@...r.kernel.org,
containers@...ts.linux-foundation.org, nauman@...gle.com,
dpshah@...gle.com, lizf@...fujitsu.com, mikew@...gle.com,
fchecconi@...il.com, paolo.valente@...more.it,
jens.axboe@...cle.com, fernando@...ellilink.co.jp,
s-uchida@...jp.nec.com, taka@...inux.co.jp,
guijianfeng@...fujitsu.com, arozansk@...hat.com, jmoyer@...hat.com,
riel@...hat.com, peterz@...radead.org, menage@...gle.com,
balbir@...ux.vnet.ibm.com, dhaval@...ux.vnet.ibm.com,
chrisw@...hat.com
Subject: 2-Level IO scheduling (Re: [dm-devel] [PATCH 1/2] dm-ioband: I/O
bandwidth controller v1.10.0: Source code and patch)
Hi Vivek,
I split this mail thread into three topics:
o 2-Level IO scheduling
o Hierarchical grouping facility for IO controller
o Implement IO controller as a dm-driver
This mail is about 2-Level IO scheduling.
> Just because device mapper framework allows one to implement IO controller
> in a separate module, we should not implement it there. It will be
> difficult to take care of issues like, configuration, breaking underlying IO
> scheduler's assumptions, capability to treat tasks and groups at same level
> etc.
If you are satisfied with low-accuracy bandwidth control by an IO
scheduler, you don't need to use dm-ioband. If you want to use
dm-ioband together with an IO scheduler, it works with any type of IO
scheduler, including the one you are developing.
> > > - If there is one task of io priority 0 in a cgroup and rest of the tasks
> > > are of io prio 7. All the tasks belong to best effort class. If tasks of
> > > lower priority (7) do lot of IO, then due to buffering there is a chance
> > > that IO from lower prio tasks is seen by CFQ first and io from higher prio
> > > task is not seen by cfq for quite some time, hence that task not getting its
> > > fair share within the cgroup. Similar situation can arise with RT tasks
> > > also.
> >
> > Whether using dm-ioband or not, if the tasks of IO priority 7 do lot
> > of IO, then the device queue is going to be full and tasks which try
> > to issue IOs are blocked until the queue gets a slot. The IOs are
> > backlogged even if they are issued from the task of IO priority 0.
> > I don't understand why you think it's the biggest issue. The same
> > thing is going to happen without dm-ioband.
> >
>
> True that even limited availability of request descriptors can be a
> bottleneck and can lead to same kind of issues but my contention is
> that you are aggravating the problem. Putting a 2nd layer can break IO
> scheduler's assumption even before underlying request queue is full.
I don't think so. dm-ioband doesn't break the IO scheduler's assumptions.
In CFQ's case, the priority order is not changed within a cgroup.
> So second level solution on top will increase the frequency of such
> incidents where a lower priority task can run away with more job done than
> high priority task because there are no separate queues for different
> priority tasks and release of buffered bio is FIFO.
>
> Secondly what happens to tasks of RT class? dm-ioband does not have any
> notion of handling the RT cgroup or RT tasks.
It's not an issue; it's a question of which policy to adopt.
I think giving the cgroup policy priority over the I/O scheduler
policy is more flexible.
> Thirdly, doing any kind of resource control at higher level takes away the
> capability to treat task and groups at same level. I have had this
> discussion in other offline thread also where you are copied. I think
> it is a good idea to treat tasks and groups at same level where possible
> (depends if IO scheduler creates separate queues for tasks or not, cfq
> does.)
>
> > If I were you, I create two cgroups and let tasks of lower priority
> > belong to one cgroup and tasks of higher priority belong to another,
> > and give higher bandwidth to the cgroup to which the higher priority
> > tasks belong. What do you think about this way?
>
> I think this is not practical. What we are talking is that task
> priority does not have any meaning. If we want service difference between
> two tasks, we need to pack them in separate cgroup otherwise we can't
> guarantee things. If we need to pack every task in separate cgroup then
> why to even have the notion of task priority.
It is possible to modify dm-ioband to cooperate with CFQ, but I'm not
sure it's really meaningful. What do you do when a task of RT class
issues a lot of I/O? Do you always give priority to the I/Os from the
task of RT class regardless of the assigned bandwidth? Which one do you
give priority to: the assigned bandwidth or the RT class?
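To make the question concrete, here is a toy sketch of the two choices.
It is plain Python, not dm-ioband or CFQ code; the function name
dispatch(), the groups A and B, and the 80:20 weights are all made up
for illustration. Group B holds an RT-class task that issues a lot of
I/O, and group A holds only best-effort (BE) tasks.

```python
from fractions import Fraction


def dispatch(pending, weights, slots, rt_first):
    """Return how many I/Os each group gets out of `slots` dispatch slots.

    pending:  group name -> {"RT": count, "BE": count} of queued I/Os
    weights:  group name -> bandwidth weight assigned to the group
    rt_first: True  = RT-class I/O always wins over the group weights
              False = group weights always win, the I/O class is ignored
    """
    left = {g: dict(c) for g, c in pending.items()}  # work on a copy
    issued = {g: 0 for g in pending}

    for _ in range(slots):
        ready = [g for g in left if left[g]["RT"] + left[g]["BE"] > 0]
        if not ready:
            break
        rt_ready = [g for g in ready if left[g]["RT"] > 0]
        candidates = rt_ready if (rt_first and rt_ready) else ready
        # Weighted fairness: serve the group furthest behind its share.
        g = min(candidates, key=lambda name: Fraction(issued[name], weights[name]))
        cls = "RT" if left[g]["RT"] > 0 else "BE"
        left[g][cls] -= 1
        issued[g] += 1
    return issued


# Hypothetical workload: both groups always have I/O queued.
pending = {"A": {"RT": 0, "BE": 100}, "B": {"RT": 100, "BE": 0}}
weights = {"A": 80, "B": 20}

bandwidth_wins = dispatch(pending, weights, 10, rt_first=False)
rt_wins = dispatch(pending, weights, 10, rt_first=True)
print(bandwidth_wins, rt_wins)
```

With these made-up numbers, the bandwidth-first policy gives A 8 of the
10 slots and B 2, while the RT-first policy gives all 10 slots to B and
the assigned 80:20 split is not honored at all.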
Thanks,
Ryo Tsuruta