linux-kernel - Re: [PATCH V3 00/11] block-throttle: add .high limit


Message-ID: <579cd85c-ff72-53cc-ece0-5e264ff3451c@gmail.com>
Date:   Thu, 6 Oct 2016 11:10:52 -0400
From:   "Austin S. Hemmelgarn" <ahferroin7@...il.com>
To:     Paolo Valente <paolo.valente@...more.it>
Cc:     Mark Brown <broonie@...nel.org>,
        Linus Walleij <linus.walleij@...aro.org>,
        Tejun Heo <tj@...nel.org>, Shaohua Li <shli@...com>,
        Vivek Goyal <vgoyal@...hat.com>, linux-block@...r.kernel.org,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Jens Axboe <axboe@...com>, Kernel-team@...com,
        jmoyer@...hat.com, Ulf Hansson <ulf.hansson@...aro.org>,
        Hannes Reinecke <hare@...e.com>
Subject: Re: [PATCH V3 00/11] block-throttle: add .high limit

On 2016-10-06 11:05, Paolo Valente wrote:
>
>> On 6 Oct 2016, at 15:52, Austin S. Hemmelgarn <ahferroin7@...il.com> wrote:
>>
>> On 2016-10-06 08:50, Paolo Valente wrote:
>>>
>>>> On 6 Oct 2016, at 13:57, Austin S. Hemmelgarn <ahferroin7@...il.com> wrote:
>>>>
>>>> On 2016-10-06 07:03, Mark Brown wrote:
>>>>> On Thu, Oct 06, 2016 at 10:04:41AM +0200, Linus Walleij wrote:
>>>>>> On Tue, Oct 4, 2016 at 9:14 PM, Tejun Heo <tj@...nel.org> wrote:
>>>>>
>>>>>>> I get that bfq can be a good compromise on most desktop workloads and
>>>>>>> behave reasonably well for some server workloads with the slice
>>>>>>> expiration mechanism but it really isn't an IO resource partitioning
>>>>>>> mechanism.
>>>>>
>>>>>> Not just desktops, also Android phones.
>>>>>
>>>>>> So why not have BFQ as a separate scheduling policy upstream,
>>>>>> alongside CFQ, deadline and noop?
>>>>>
>>>>> Right.
>>>>>
>>>>>> We're already doing the per-usecase Kconfig thing for preemption.
>>>>>> But maybe somebody already hates that and wants to get rid of it,
>>>>>> I don't know.
>>>>>
>>>>> Hannes also suggested going back to making BFQ a separate scheduler
>>>>> rather than replacing CFQ earlier, pointing out that it mitigates
>>>>> against the risks of changing CFQ substantially at this point (which
>>>>> seems to be the biggest issue here).
>>>>>
>>>> ISTR that the original argument for this approach essentially amounted to: 'If it's so much better, why do we need both?'.
>>>>
>>>> Such an argument is valid only if the new design is better in all respects (which there isn't sufficient information to decide in this case), or the negative aspects are worth the improvements (which is too workload specific to decide for something like this).
>>>
>>> All correct, apart from the workload-specific issue, which is not very clear to me. Over the last five years I have not found a single workload for which CFQ is better than BFQ, and none has been suggested.
>> My point is that whether or not BFQ is better depends on the workload. You can't test for every workload, so you can't say definitively that BFQ is better for every workload.
>
> Yes
>
>>  At a minimum, there are workloads where the deadline and noop schedulers are better, but they're very domain specific workloads.
>
> Definitely
>
>>  Based on the numbers from Shaohua, it looks like CFQ has better throughput than BFQ, and that will affect some workloads (for most, the improved fairness is worth the reduced throughput, but there probably are some cases where it isn't).
>
> Well, no fairness, as with deadline and noop, but with much less throughput
> than deadline and noop, doesn't sound much like the best scheduler for
> those workloads.  With BFQ you have service guarantees, with noop or
> deadline you have maximum throughput.
And with CFQ you have something in between, which is half of why I think
CFQ is still worth keeping (the other half being the people who
inevitably want to stay on CFQ).  And TBH, deadline and noop only give
good throughput with specific workloads.  Noop in particular is usually
only useful on tiny systems where the overhead of scheduling is greater
than the time it saves (like some very low power embedded systems), or
when scheduling is already done elsewhere in the storage stack (like in
a VM).
>
>>>
>>> Anyway, leaving aside this fact, IMO the real problem here is that we are in a catch-22: "we want BFQ to replace CFQ, but, since CFQ is legacy code, then you cannot change, and thus replace, CFQ"
>> I agree that that's part of the issue, but I also don't entirely agree with the reasoning on it.  Until blk-mq has proper I/O scheduling, people will continue to use CFQ, and based on the way things are going, it will be multiple months before that happens, whereas BFQ exists and is working now.
>
> Exactly!
>
> Thanks,
> Paolo
>