Message-Id: <20090421.210614.112619728.ryov@valinux.co.jp>
Date: Tue, 21 Apr 2009 21:06:14 +0900 (JST)
From: Ryo Tsuruta <ryov@...inux.co.jp>
To: nauman@...gle.com
Cc: vgoyal@...hat.com, fernando@....ntt.co.jp,
linux-kernel@...r.kernel.org, jmoyer@...hat.com,
dm-devel@...hat.com, jens.axboe@...cle.com, agk@...hat.com,
balbir@...ux.vnet.ibm.com
Subject: Re: [dm-devel] Re: dm-ioband: Test results.
Hi Nauman,
> >> >> > General thoughts about dm-ioband
> >> >> > ================================
> >> >> > - Implementing control at second level has the advantage that one does not
> >> >> > have to muck with IO scheduler code. But then it also has the
> >> >> > disadvantage that there is no communication with IO scheduler.
> >> >> >
> >> >> > - dm-ioband is buffering bio at higher layer and then doing FIFO release
> >> >> > of these bios. This FIFO release can lead to priority inversion problems
> >> >> > in certain cases where RT requests are way behind BE requests or
> >> >> > reader starvation where reader bios are getting hidden behind writer
> >> >> > bios etc. These are hard to notice issues in user space. I guess above
> >> >> > RT results do highlight the RT task problems. I am still working on
> >> >> > other test cases and see if I can show the problem.
> >>
> >> Ryo, I could not agree more with Vivek here. At Google, we have very
> >> stringent requirement for latency of our RT requests. If RT requests
> >> get queued in any higher layer (behind BE requests), all bets are off.
> >> I don't favor doing IO control at two layers for this particular reason.
> >> The upper layer (dm-ioband in this case) would have to make sure that
> >> RT requests are released immediately, irrespective of the state (FIFO
> >> queuing and tokens held). And the lower layer (IO scheduling layer)
> >> has to do the same. This requirement is not specific to us. I have
> >> seen similar comments from filesystem folks here previously, in the
> >> context of metadata updates being submitted as RT. Basically, the
> >> semantics of RT class has to be preserved by any solution that is
> >> built on top of CFQ scheduler.
> >
> > I could see the priority inversion by running Vivek's script and I
> > > understand how RT requests have to be handled. I'll create a patch
> > > which makes dm-ioband cooperate with CFQ scheduler. However, do you
> > think we need some kind of limitation on processes which belong to the
> > RT class to prevent the processes from depleting bandwidth?
>
> If you are talking about starvation that could be caused by RT tasks,
> you are right. We need some mechanism to introduce starvation
> prevention, but I think that is an issue that can be tackled once we
> decide where to do bandwidth control.
>
> The real question is, once you create a version of dm-ioband that
> co-operates with CFQ scheduler, how that solution would compare with
> the patch set Vivek has posted? In my opinion, we need to converge to
> one solution as soon as possible, so that we can work on it together
> to refine and test it.
I think I can help with your work, but I want to continue the
development of dm-ioband, because dm-ioband actually works well and
I think it has some advantages over other IO controllers.
- It can be used without cgroup.
- It can control bandwidth on a per-partition basis.
- The driver module can be replaced without stopping the system.
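
For reference, the priority inversion described in the quoted text can be
illustrated with a toy model. This is only a sketch under stated
assumptions: the request tuples, the function names, and the "RT"/"BE"
labels are hypothetical and are not taken from dm-ioband or CFQ code. It
compares strict FIFO release of buffered requests with a release order
that lets RT requests go first.

```python
from collections import deque

# Hypothetical model: each request is (name, io_class), where io_class is
# "RT" (real-time) or "BE" (best-effort). These names are illustrative
# only and do not correspond to dm-ioband or CFQ data structures.


def fifo_release(requests):
    """Release buffered requests strictly in arrival order."""
    queue = deque(requests)
    order = []
    while queue:
        order.append(queue.popleft())
    return order


def rt_first_release(requests):
    """Release every RT request before any BE request.

    Arrival order is preserved within each class.
    """
    rt = [r for r in requests if r[1] == "RT"]
    be = [r for r in requests if r[1] != "RT"]
    return rt + be


def position_of(order, name):
    """Return the index at which the named request is released."""
    return next(i for i, (n, _) in enumerate(order) if n == name)


# Three BE requests arrive before a single RT request.
arrivals = [("be1", "BE"), ("be2", "BE"), ("be3", "BE"), ("rt1", "RT")]

fifo = fifo_release(arrivals)
rt_first = rt_first_release(arrivals)

print("FIFO release position of rt1:", position_of(fifo, "rt1"))
print("RT-first release position of rt1:", position_of(rt_first, "rt1"))
```

With FIFO release the RT request waits behind all three BE requests. The
strict RT-first order avoids that, but it is also the kind of policy that
raises the starvation question above, since a steady stream of RT
requests would keep BE requests waiting indefinitely.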
Thanks,
Ryo Tsuruta