Message-ID: <4e5e476b0911091347t60e4d572kef2e632800fbf849@mail.gmail.com>
Date: Mon, 9 Nov 2009 22:47:48 +0100
From: Corrado Zoccolo <czoccolo@...il.com>
To: Vivek Goyal <vgoyal@...hat.com>
Cc: linux-kernel@...r.kernel.org, jens.axboe@...cle.com,
nauman@...gle.com, dpshah@...gle.com, lizf@...fujitsu.com,
ryov@...inux.co.jp, fernando@....ntt.co.jp, s-uchida@...jp.nec.com,
taka@...inux.co.jp, guijianfeng@...fujitsu.com, jmoyer@...hat.com,
balbir@...ux.vnet.ibm.com, righi.andrea@...il.com,
m-ikeda@...jp.nec.com, akpm@...ux-foundation.org, riel@...hat.com,
kamezawa.hiroyu@...fujitsu.com
Subject: Re: [RFC] Workload type Vs Groups (Was: Re: [PATCH 02/20] blkio:
Change CFQ to use CFS like queue time stamps)
On Fri, Nov 6, 2009 at 11:22 PM, Vivek Goyal <vgoyal@...hat.com> wrote:
> Hi All,
>
> I am now rebasing my patches onto the for-2.6.33 branch. There are a
> significant number of changes in that branch, and the changes from
> Corrado in particular raise an interesting question.
>
> Corrado has introduced functionality that effectively groups the cfq
> queues by workload type and allocates time slots to these sub-groups
> (sync-idle, sync-noidle, async).
>
> I was thinking of placing groups on top of this model, so that we
> select the group first, then the workload type, and finally the queue
> to run.
>
> Corrado came up with an interesting suggestion (in a private mail):
> what if we implement workload type at the top and divide the share
> among groups within each workload type?
>
> So one would first select the workload to run, then the group within
> that workload, and then the cfq queue within the group.
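
For illustration, here is a minimal userspace sketch of the two
selection orders (hypothetical structs and names, with a trivial
"first busy queue wins" pick instead of the real time-stamp based
selection; this is not the actual cfq code):

/*
 * Toy model of the two selection orders.  All names are made up and
 * only the nesting of the loops matters.
 */
#include <stdio.h>
#include <stddef.h>

enum wl_type { SYNC_IDLE, SYNC_NOIDLE, ASYNC, NR_WL };
#define NR_GROUPS 2

struct toy_queue { const char *name; int busy; };

/* one service list per (group, workload type); first busy queue wins */
static struct toy_queue *tree[NR_GROUPS][NR_WL];

/* Option A (groups on top): group -> workload type -> queue */
static struct toy_queue *pick_group_first(void)
{
	for (int g = 0; g < NR_GROUPS; g++)
		for (int w = 0; w < NR_WL; w++)
			if (tree[g][w] && tree[g][w]->busy)
				return tree[g][w];
	return NULL;
}

/* Option B (workload type on top): workload type -> group -> queue */
static struct toy_queue *pick_workload_first(void)
{
	for (int w = 0; w < NR_WL; w++)
		for (int g = 0; g < NR_GROUPS; g++)
			if (tree[g][w] && tree[g][w]->busy)
				return tree[g][w];
	return NULL;
}

int main(void)
{
	static struct toy_queue q_async  = { "group0/async", 1 };
	static struct toy_queue q_noidle = { "group1/sync-noidle", 1 };

	tree[0][ASYNC] = &q_async;
	tree[1][SYNC_NOIDLE] = &q_noidle;

	/* group-first serves group0's async IO before group1's
	 * sync-noidle IO; workload-first does the opposite */
	printf("group first:    %s\n", pick_group_first()->name);
	printf("workload first: %s\n", pick_workload_first()->name);
	return 0;
}
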
>
> The advantages of this approach are:
>
> - For the sync-noidle workload, we will not idle per group; we will
> idle only at the root level. (Though if we don't idle on a group once
> it becomes empty, we will not see fairness for that group, so it
> becomes a fairness vs. throughput call.)
>
> - It allows us to limit the system-wide share of a workload type. For
> example, one can effectively fix the system-wide share of async
> queues. It is generally not prudent to allocate a group 50% of the
> disk share and then have that group do only async IO while sync IO in
> the rest of the groups suffers.
>
> Disadvantage
>
> - The definition of fairness becomes a bit murkier. Now fairness will
> be achieved for a group within each workload type. So if one group is
> doing both sync-idle and sync-noidle IO while another group is doing
> only sync-noidle IO, the first group will get more disk time overall
> even if both groups have the same weight.
The fairness definition was always debated (disk time vs. data transferred).
I think both have some reason to exist: disk time is a good measure for
sync-idle workloads, like sequential readers, while data transferred is a
good measure for sync-noidle workloads, like random readers.
Unfortunately, the two measures do not seem comparable, so we seem
obliged to schedule the two kinds of workloads independently.
Actually, I think we can compute a feedback value from each scheduling
turn that can be used to temporarily alter the weights in the next
turn, in order to reach long-term fairness.
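
To make the feedback idea a bit more concrete, a rough userspace
sketch (invented names, not a patch): at the end of each turn, compare
the service each group actually received with its weight-proportional
target, and carry the error over as a damped correction to the
effective weight used in the next turn:

/*
 * Sketch of turn-by-turn feedback.  "Service" stands for whatever
 * metric the workload type uses (time or sectors); names are made up.
 */
#include <stdio.h>

#define NR_GROUPS 2

struct toy_group {
	unsigned int weight;		/* configured weight */
	long long debt;			/* under-service (+) / over-service (-) */
	unsigned long long served;	/* service received this turn */
};

static struct toy_group groups[NR_GROUPS] = {
	{ .weight = 100 }, { .weight = 100 },
};

/* called at the end of a scheduling turn */
static void account_turn(void)
{
	unsigned long long total_served = 0;
	unsigned int total_weight = 0;

	for (int i = 0; i < NR_GROUPS; i++) {
		total_served += groups[i].served;
		total_weight += groups[i].weight;
	}

	for (int i = 0; i < NR_GROUPS; i++) {
		/* what the group should have received, by weight */
		unsigned long long target =
			total_served * groups[i].weight / total_weight;

		groups[i].debt += (long long)target - (long long)groups[i].served;
		groups[i].served = 0;
	}
}

/* effective weight used when carving up the next turn */
static unsigned int effective_weight(const struct toy_group *g)
{
	long long w = (long long)g->weight + g->debt / 16; /* damp the correction */

	return w > 1 ? (unsigned int)w : 1;
}

int main(void)
{
	/* turn 1: group 0 got twice the service of group 1 */
	groups[0].served = 200;
	groups[1].served = 100;
	account_turn();

	printf("next-turn weights: %u %u\n",
	       effective_weight(&groups[0]), effective_weight(&groups[1]));
	return 0;
}

How the per-turn service is normalized across the two metrics, and how
strongly to damp the correction, are of course the open questions.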
Thanks,
Corrado
>
> Looking for some feedback about which approach makes more sense before
> I write the patches.
>
> Thanks
> Vivek
>