Message-Id: <20081126.214707.653026525707335397.ryov@valinux.co.jp>
Date: Wed, 26 Nov 2008 21:47:07 +0900 (JST)
From: Ryo Tsuruta <ryov@...inux.co.jp>
To: vgoyal@...hat.com
Cc: linux-kernel@...r.kernel.org,
containers@...ts.linux-foundation.org,
virtualization@...ts.linux-foundation.org, jens.axboe@...cle.com,
taka@...inux.co.jp, righi.andrea@...il.com, s-uchida@...jp.nec.com,
fernando@....ntt.co.jp, balbir@...ux.vnet.ibm.com,
akpm@...ux-foundation.org, menage@...gle.com, ngupta@...gle.com,
riel@...hat.com, jmoyer@...hat.com, peterz@...radead.org,
fchecconi@...il.com, paolo.valente@...more.it
Subject: Re: [patch 0/4] [RFC] Another proportional weight IO controller
Hi Vivek,
From: Vivek Goyal <vgoyal@...hat.com>
Subject: Re: [patch 0/4] [RFC] Another proportional weight IO controller
Date: Tue, 25 Nov 2008 11:27:20 -0500
> On Tue, Nov 25, 2008 at 11:33:59AM +0900, Ryo Tsuruta wrote:
> > Hi Vivek,
> >
> > > > > Ryo, do you still want to stick to two level scheduling? Given the problem
> > > > > of it breaking down underlying scheduler's assumptions, probably it makes
> > > > > more sense to do the IO control at each individual IO scheduler.
> > > >
> > > > I don't want to stick to it. I'm considering implementing dm-ioband's
> > > > algorithm into the block I/O layer experimentally.
> > >
> > > Thanks Ryo. Implementing a control at block layer sounds like another
> > > 2 level scheduling. We will still have the issue of breaking underlying
> > > CFQ and other schedulers. How do you plan to resolve that conflict?
> >
> > I think there is no conflict against I/O schedulers.
> > Could you explain the conflict to me?
>
> Because we do the buffering at higher level scheduler and mostly release
> the buffered bios in the FIFO order, it might break the underlying IO
> schedulers. Generally it is the decision of IO scheduler to determine in
> what order to release buffered bios.
>
> For example, suppose there is one task of io priority 0 in a cgroup and the
> rest of the tasks are of io prio 7. All the tasks belong to best effort
> class. If tasks of lower priority (7) do a lot of IO, then due to buffering
> there is a chance that IO from lower prio tasks is seen by CFQ first and IO
> from the higher prio task is not seen by CFQ for quite some time, hence that
> task not getting its fair share within the cgroup. Similar situations can
> arise with RT tasks also.
Thanks for your explanation.
I think the same thing can occur even without the higher-level scheduler:
while the underlying device's request queue is full, every task issuing
I/O is blocked before its I/Os reach the I/O scheduler.
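To make the ordering concern above concrete, here is a toy model in Python.
It is only a sketch under stated assumptions, not how CFQ or dm-ioband
actually work: the function name dispatch_order, the (name, prio) tuples,
and the "window" parameter are all hypothetical, and the lower scheduler is
reduced to a plain sort by io priority over whatever bios it can see at
once. A small window stands in for either an upper-level FIFO buffer or a
full request queue.

```python
from collections import deque

def dispatch_order(bios, window):
    """Release bios in FIFO order, `window` at a time (hypothetical model).

    Within each released batch the lower scheduler sorts by io priority
    (0 is highest). It can only reorder the bios it has already seen.
    """
    pending = deque(bios)
    order = []
    while pending:
        batch = [pending.popleft() for _ in range(min(window, len(pending)))]
        # sorted() is stable, so equal-priority bios keep their FIFO order.
        order.extend(sorted(batch, key=lambda bio: bio[1]))
    return order

# Twelve bios from prio-7 tasks arrive before one bio from the prio-0 task.
bios = [(f"low{i}", 7) for i in range(12)] + [("high", 0)]

buffered = dispatch_order(bios, window=4)         # scheduler sees 4 at a time
direct = dispatch_order(bios, window=len(bios))   # scheduler sees everything

buffered_pos = [name for name, _ in buffered].index("high")
direct_pos = [name for name, _ in direct].index("high")
print(buffered_pos, direct_pos)
```

In this model the prio-0 bio is dispatched last when the window is small and
first when the scheduler sees the whole backlog, whichever layer imposes the
window.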
> > > What do you think about the solution at IO scheduler level (like BFQ) or
> > > maybe a little above that, where one can try some code sharing among IO
> > > schedulers?
> >
> > I would like to support any type of block device even if I/Os issued
> > to the underlying device don't go through an IO scheduler. Dm-ioband
> > can be used for devices such as the loop device.
> >
>
> What do you mean by that IO issued to underlying device does not go
> through IO scheduler? loop device will be associated with a file and
> IO will ultimately go to the IO scheduler which is serving those file
> blocks?
What if the file is on an NFS-mounted file system?
> What's the use case scenario of doing IO control at loop device?
> Ultimately the resource contention will take place on actual underlying
> physical device where the file blocks are. Will doing the resource control
> there not solve the issue for you?
I can't come up with a use case right now, but I would like to make the
resource controller more flexible. In fact, one of the block devices
that I'm using does not use an I/O scheduler.
Thanks,
Ryo Tsuruta