Message-ID: <4e5e476b0911100956u47bb3111jeba9fbcc3d992960@mail.gmail.com>
Date: Tue, 10 Nov 2009 18:56:01 +0100
From: Corrado Zoccolo <czoccolo@...il.com>
To: Jens Axboe <jens.axboe@...cle.com>
Cc: Linux-Kernel <linux-kernel@...r.kernel.org>,
Jeff Moyer <jmoyer@...hat.com>, aaronc@...ato.unsw.edu.au
Subject: Re: [RFC, PATCH] cfq-iosched: remove redundant queuing detection code
On Tue, Nov 10, 2009 at 4:14 PM, Jens Axboe <jens.axboe@...cle.com> wrote:
> On Tue, Nov 10 2009, Corrado Zoccolo wrote:
>> The core block layer already has code to detect presence of command
>> queuing devices. We convert cfq to use that instead of re-doing the
>> computation.
>
> There is the major difference that the CFQ variant is dynamic and the
> block layer one is not. This change came from Aaron some time ago IIRC,
> see commit 45333d5. It's a bit of a chicken and egg problem.
The comment by Aaron:

  CFQ's detection of queueing devices assumes a non-queuing device and detects
  if the queue depth reaches a certain threshold. Under some workloads (e.g.
  synchronous reads), CFQ effectively forces a unit queue depth, thus defeating
  the detection logic. This leads to poor performance on queuing hardware,
  since the idle window remains enabled.

makes me think that the dynamic-off detection in cfq may really be
buggy (BTW this could explain the bad results on SSDs that Jeff observed
before my patch set).
The problem is that once hw_tag is 0, it is difficult for it to
become 1 again, as Aaron explains, since cfq will rarely send more
than 1 request at a time. My patch set fixes this for SSDs (the seeky
readers will still be dispatched without idling, and if there are enough
of them, the logic will see a large enough depth to reconsider the
initial decision).
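
To make the chicken-and-egg loop concrete, here is a toy model in C. It
is not the cfq code: simulate_hw_tag(), DEPTH_THRESHOLD, and the
parameters pending and noidle_queues are all made up for illustration.
The assumption is that with hw_tag = 0 the scheduler idles and keeps one
request in flight, unless some queues are dispatched without idling.

```c
#include <stdbool.h>

#define DEPTH_THRESHOLD 4	/* assumed value, for illustration only */

/*
 * Toy model of the feedback loop (hypothetical, not kernel code).
 * Returns the hw_tag value after `rounds` dispatch rounds.
 *
 * pending:       requests the workload could keep in flight
 * noidle_queues: queues that are dispatched without idling
 */
static bool simulate_hw_tag(bool hw_tag, int pending, int noidle_queues,
			    int rounds)
{
	for (int i = 0; i < rounds; i++) {
		int depth;

		if (hw_tag) {
			/* No idling: the device sees the full depth. */
			depth = pending;
		} else {
			/* Idling: one request, plus the no-idle queues. */
			depth = noidle_queues > 1 ? noidle_queues : 1;
			if (depth > pending)
				depth = pending;
		}

		/* The detector only ever sees what was dispatched. */
		if (depth >= DEPTH_THRESHOLD)
			hw_tag = true;
	}
	return hw_tag;
}
```

In this model, starting from hw_tag = 0 with only idling queues, the
observed depth stays at 1 and the flag never flips, however long we run.
With enough no-idle queues, the depth crosses the threshold and the
decision is revisited.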
So the only sound way to do the detection is to start in an
indeterminate state, in which CFQ behaves as if hw_tag = 1. Then,
if we never see a large depth over a long observation period, we switch
to hw_tag = 0; otherwise we stick to hw_tag = 1, without reconsidering
it.
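
A minimal sketch of that three-state logic, in standalone C rather than
as a patch. All names (hw_tag_state, hw_tag_detector, hw_tag_sample) and
both constants are hypothetical. The sketch also assumes that the switch
to hw_tag = 0 is final, which the description above does not settle.

```c
#include <stdbool.h>

/* Hypothetical names and thresholds, chosen for illustration only. */
enum hw_tag_state { HW_TAG_UNKNOWN, HW_TAG_ON, HW_TAG_OFF };

#define DEPTH_THRESHOLD	4	/* assumed: depth that counts as "large" */
#define OBSERVE_SAMPLES	50	/* assumed: length of the observation period */

struct hw_tag_detector {
	enum hw_tag_state state;
	int samples;		/* low-depth samples seen while undecided */
};

static void hw_tag_init(struct hw_tag_detector *d)
{
	d->state = HW_TAG_UNKNOWN;
	d->samples = 0;
}

/* Feed one observation of the number of requests in flight. */
static void hw_tag_sample(struct hw_tag_detector *d, int in_flight)
{
	if (d->state != HW_TAG_UNKNOWN)
		return;		/* decided: never reconsidered */

	if (in_flight >= DEPTH_THRESHOLD) {
		d->state = HW_TAG_ON;
		return;
	}

	/* Observation period elapsed without seeing a large depth. */
	if (++d->samples >= OBSERVE_SAMPLES)
		d->state = HW_TAG_OFF;
}

/* Scheduler's view: treat the device as queuing unless proven otherwise. */
static bool hw_tag_enabled(const struct hw_tag_detector *d)
{
	return d->state != HW_TAG_OFF;
}
```

Because hw_tag_enabled() returns true while the state is undecided, the
scheduler does not idle during the observation period, so a queuing
device gets the chance to show a large depth before any decision is
made.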
I think the correct logic could be pushed into blk-core by
also introducing an indeterminate bit.
Corrado
>
> --
> Jens Axboe
>
>