[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4e5e476b0909030947m2423d0cdr5fd2b3de261b2da6@mail.gmail.com>
Date: Thu, 3 Sep 2009 18:47:46 +0200
From: Corrado Zoccolo <czoccolo@...il.com>
To: Jeff Moyer <jmoyer@...hat.com>
Cc: Linux-Kernel <linux-kernel@...r.kernel.org>,
Jens Axboe <jens.axboe@...cle.com>
Subject: Re: [RFC] cfq: adapt slice to number of processes doing I/O
Hi Jeff,
can you share the benchmark?
I think I have to fix the min slice to consider priority, too, to
respect the priorities when there are many processes.
For the fairness at a single priority level, my tests show that
fairness is improved with the patches (comparing minimum and maximum
bandwidth for a set of 32 processes):
Original:
Run status group 0 (all jobs):
READ: io=14192KiB, aggrb=480KiB/s, minb=7KiB/s, maxb=20KiB/s,
mint=30001msec, maxt=30258msec
Run status group 1 (all jobs):
READ: io=829292KiB, aggrb=27816KiB/s, minb=723KiB/s,
maxb=1004KiB/s, mint=30004msec, maxt=30529msec
Adaptive:
Run status group 0 (all jobs):
READ: io=14444KiB, aggrb=488KiB/s, minb=12KiB/s, maxb=17KiB/s,
mint=30003msec, maxt=30298msec
Run status group 1 (all jobs):
READ: io=721324KiB, aggrb=24140KiB/s, minb=689KiB/s, maxb=795KiB/s,
mint=30003msec, maxt=30598msec
Are you using random think times? This could explain the discrepancy.
Corrado
On Thu, Sep 3, 2009 at 5:38 PM, Jeff Moyer<jmoyer@...hat.com> wrote:
> Jeff Moyer <jmoyer@...hat.com> writes:
>
>> Corrado Zoccolo <czoccolo@...il.com> writes:
>>
>>> When the number of processes performing I/O concurrently increases, a
>>> fixed time slice per process will cause large latencies.
>>> In the patch, if there are more than 3 processes performing concurrent
>>> I/O, we scale the time slice down proportionally.
>>> To safeguard sequential bandwidth, we impose a minimum time slice,
>>> computed from cfq_slice_idle (the idea is that cfq_slice_idle
>>> approximates the cost for a seek).
>>>
>>> I performed two tests, on a rotational disk:
>>> * 32 concurrent processes performing random reads
>>> ** the bandwidth is improved from 466KB/s to 477KB/s
>>> ** the maximum latency is reduced from 7.667s to 1.728
>>> * 32 concurrent processes performing sequential reads
>>> ** the bandwidth is reduced from 28093KB/s to 24393KB/s
>>> ** the maximum latency is reduced from 3.781s to 1.115s
>>>
>>> I expect numbers to be even better on SSDs, where the penalty to
>>> disrupt sequential read is much less.
>>
>> Interesting approach. I'm not sure what the benefits will be on SSDs,
>> as the idling logic is disabled for them (when nonrot is set and they
>> support ncq). See cfq_arm_slice_timer.
>>
>>> Signed-off-by: Corrado Zoccolo <czoccolo@...il-com>
>>>
>>> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
>>> index fd7080e..cff4ca8 100644
>>> --- a/block/cfq-iosched.c
>>> +++ b/block/cfq-iosched.c
>>> @@ -306,7 +306,15 @@ cfq_prio_to_slice(struct cfq_data *cfqd, struct
>>> cfq_queue *cfqq)
>>> static inline void
>>> cfq_set_prio_slice(struct cfq_data *cfqd, struct cfq_queue *cfqq)
>>> {
>>> - cfqq->slice_end = cfq_prio_to_slice(cfqd, cfqq) + jiffies;
>>> + unsigned low_slice = cfqd->cfq_slice_idle * (1 + cfq_cfqq_sync(cfqq));
>>> + unsigned interested_queues = cfq_class_rt(cfqq) ?
>>> cfqd->busy_rt_queues : cfqd->busy_queues;
>>
>> Either my mailer displayed this wrong, or yours wraps lines.
>>
>>> + unsigned slice = cfq_prio_to_slice(cfqd, cfqq);
>>> + if (interested_queues > 3) {
>>> + slice *= 3;
>>
>> How did you come to this magic number of 3, both for the number of
>> competing tasks and the multiplier for the slice time? Did you
>> experiment with this number at all?
>>
>>> + slice /= interested_queues;
>>
>> Of course you realize this could disable the idling logic completely,
>> right? I'll run this patch through some tests and let you know how it
>> goes.
>
> I missed that you updated the slice end based on a max of slice and
> low_slice. Sorry about that.
>
> This patch does not fare well when judging fairness between processes.
> I have several fio jobs that generate read workloads, and I try to
> figure out whether the I/O scheduler is providing fairness based on the
> I/O priorities of the processes. With your patch applied, we get the
> following results:
>
> total priority: 880
> total data transferred: 1045920
> class prio ideal xferred %diff
> be 0 213938 352500 64
> be 1 190167 193012 1
> be 2 166396 123380 -26
> be 3 142625 86260 -40
> be 4 118854 62964 -48
> be 5 95083 40180 -58
> be 6 71312 74484 4
> be 7 47541 113140 137
>
> Class and prio should be self-explanatory. ideal is my cooked up
> version of the ideal number of bytes the given priority should have
> transferred based on the total data transferred and all processes
> weighted by priority competing for the disk. xferred is the actual
> amount of data transferred, and %diff is the difference between those
> last two columns.
>
> Notice that best effort priority 7 managed to transfer more data than be
> prio 3. That's bad. Now, let's look at 8 processes all at the same
> priority level:
>
> total priority: 800
> total data transferred: 1071036
> class prio ideal xferred %diff
> be 4 133879 222452 66
> be 4 133879 243188 81
> be 4 133879 187380 39
> be 4 133879 42512 -69
> be 4 133879 39156 -71
> be 4 133879 47604 -65
> be 4 133879 37364 -73
> be 4 133879 251380 87
>
> Hmm. That doesn't look good.
>
> For comparison, here is the output from the vanilla kernel for those two
> runs:
>
> total priority: 880
> total data transferred: 954272
> class prio ideal xferred %diff
> be 0 195192 229108 17
> be 1 173504 202740 16
> be 2 151816 156660 3
> be 3 130128 152052 16
> be 4 108440 91636 -16
> be 5 86752 64244 -26
> be 6 65064 34292 -48
> be 7 43376 23540 -46
>
> total priority: 800
> total data transferred: 887264
> class prio ideal xferred %diff
> be 4 110908 124404 12
> be 4 110908 123380 11
> be 4 110908 118004 6
> be 4 110908 113396 2
> be 4 110908 107252 -4
> be 4 110908 98356 -12
> be 4 110908 96244 -14
> be 4 110908 106228 -5
>
> It's worth noting that the overall throughput went up in the patched
> kernel for this second case. However, if we care at all about the
> notion of I/O priorities, I think your patch needs more work.
>
> Cheers,
> Jeff
>
--
__________________________________________________________________________
dott. Corrado Zoccolo mailto:czoccolo@...il.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
Tales of Power - C. Castaneda
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists