linux-kernel - Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra scheduler

Message-ID: <f71cb330-8e60-694e-0494-43ad8bc4b91b@kernel.dk>
Date:   Fri, 28 Oct 2016 08:07:35 -0600
From:   Jens Axboe <axboe@...nel.dk>
To:     Linus Walleij <linus.walleij@...aro.org>
Cc:     Ulf Hansson <ulf.hansson@...aro.org>,
        Paolo Valente <paolo.valente@...aro.org>,
        Christoph Hellwig <hch@...radead.org>,
        Arnd Bergmann <arnd@...db.de>,
        Bart Van Assche <bart.vanassche@...disk.com>,
        Jan Kara <jack@...e.cz>, Tejun Heo <tj@...nel.org>,
        linux-block@...r.kernel.org,
        Linux-Kernel <linux-kernel@...r.kernel.org>,
        Mark Brown <broonie@...nel.org>,
        Hannes Reinecke <hare@...e.de>,
        Grant Likely <grant.likely@...retlab.ca>,
        James Bottomley <James.Bottomley@...senpartnership.com>
Subject: Re: [PATCH 00/14] introduce the BFQ-v0 I/O scheduler as an extra
 scheduler

On 10/27/2016 04:27 PM, Linus Walleij wrote:
> On Thu, Oct 27, 2016 at 11:08 PM, Jens Axboe <axboe@...nel.dk> wrote:
>
>> blk-mq has evolved to support a variety of devices, there's nothing
>> special about mmc that can't work well within that framework.
>
> There is. Read mmc_queue_thread() in drivers/mmc/card/queue.c
>
> This repeatedly calls req = blk_fetch_request(q);, starting one request
> and then getting the next one off the queue, including reading
> a few NULL requests off the end of the queue (to satisfy the
> semantics of its state machine).
>
> It then preprocesses each request by essentially calling .pre() and .post()
> hooks all the way down to the driver, flushing its mapped
> sglist from CPU to DMA device memory (not a problem on x86 and
> other DMA-coherent archs, but a big win on the incoherent ones).
>
> In the attempt that was posted recently this is achieved by lying
> and saying the HW queue is two items deep and eating requests
> off that queue calling pre/post on them.
>
> But as there actually exist MMC cards with command queueing, this
> would become hopeless to handle, the hw queue depth has to reflect
> the real depth. What we need is for the block core to call pre/post
> hooks on each request.
>
> The "only" thing that doesn't work well after that is that CFQ is no
> longer in action, which will have interesting effects on MMC throughput
> in any fio-like stress test as it is mostly single-hw-queue.

That will cause you pain with any IO scheduler that has more complex
state, like CFQ and BFQ... I looked at the code but I don't quite get
why it is handling requests like that. Care to expand? Is it a
performance optimization? It looks fairly convoluted for some reason. I
would imagine that latency would be one of the more important aspects
for mmc, yet the driver has a context switch for each sync IO.

-- 
Jens Axboe
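
The request handling described in the quoted text (fetch one request, start it, then fetch and prepare the next while the first is still in flight, with a trailing NULL fetch to drain the pipeline) can be sketched roughly as below. This is a simplified user-space illustration, not the code in drivers/mmc/card/queue.c: `struct req`, `fetch_request()`, `log_event()` and `run_pipeline()` are hypothetical names, and the exact ordering of the wait, start and post steps is an assumption made for the sketch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a block request; not the kernel's struct request. */
struct req { int id; };

/* Append "<what><id> " to the log, truncating if the buffer is full. */
static void log_event(char *log, size_t cap, const char *what, int id)
{
    size_t used = strlen(log);
    if (used < cap)
        snprintf(log + used, cap - used, "%s%d ", what, id);
}

/* Hypothetical queue: hands out reqs[0..n-1] in order, then NULL. */
static struct req *fetch_request(struct req *reqs, int n, int *next)
{
    return (*next < n) ? &reqs[(*next)++] : NULL;
}

/*
 * Two-stage pipeline: the "pre" step (e.g. mapping and flushing the
 * scatterlist) for the new request overlaps with the transfer of the
 * previous one. A NULL fetch at the end only drains the in-flight request.
 */
void run_pipeline(int n, char *log, size_t cap)
{
    struct req reqs[8];
    struct req *in_flight = NULL;
    int next = 0;

    if (cap == 0)
        return;
    if (n > 8)
        n = 8;
    for (int i = 0; i < n; i++)
        reqs[i].id = i;
    log[0] = '\0';

    for (;;) {
        struct req *req = fetch_request(reqs, n, &next);

        if (!req && !in_flight)
            break;                                   /* queue empty, nothing pending */
        if (req)
            log_event(log, cap, "pre", req->id);     /* prepare next request early */
        if (in_flight)
            log_event(log, cap, "wait", in_flight->id); /* previous transfer done */
        if (req)
            log_event(log, cap, "start", req->id);   /* issue the prepared request */
        if (in_flight)
            log_event(log, cap, "post", in_flight->id); /* unmap the previous one */
        in_flight = req;
    }
}
```

Calling `run_pipeline()` with two requests shows `pre1` being logged before `wait0`, which is the overlap the pre/post hooks exist to provide on DMA-incoherent architectures.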