Date:   Tue, 30 Jan 2018 16:40:01 +0100
From:   Paolo Valente <paolo.valente@...aro.org>
To:     Ming Lei <ming.lei@...hat.com>
Cc:     Oleksandr Natalenko <oleksandr@...alenko.name>,
        Ivan Kozik <ivan@...ios.org>,
        linux-block <linux-block@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        'Paolo Valente' via bfq-iosched 
        <bfq-iosched@...glegroups.com>, Jens Axboe <axboe@...nel.dk>,
        Linus Walleij <linus.walleij@...aro.org>,
        SERENA ZIVIANI <169364@...denti.unimore.it>
Subject: Re: v4.15 and I/O hang with BFQ



> Il giorno 30 gen 2018, alle ore 15:40, Ming Lei <ming.lei@...hat.com> ha scritto:
> 
> On Tue, Jan 30, 2018 at 03:30:28PM +0100, Oleksandr Natalenko wrote:
>> Hi.
>> 
> ...
>>   systemd-udevd-271   [000] ....     4.311033: bfq_insert_requests: insert rq->0
>>   systemd-udevd-271   [000] ...1     4.311037: blk_mq_do_dispatch_sched: not get rq, 1
>>          cfdisk-408   [000] ....    13.484220: bfq_insert_requests: insert rq->1
>>    kworker/0:1H-174   [000] ....    13.484253: blk_mq_do_dispatch_sched: not get rq, 1
>> ===
>> 
>> Looks the same, right?
> 
> Yeah, same with before.
> 

Hi guys,
sorry for the delay with this fix.  We are proceeding very slowly on
this, because I'm super busy.  Anyway, I can now at least explain in
more detail the cause of this hang.  Commit 'a6a252e64914
("blk-mq-sched: decide how to handle flush rq via RQF_FLUSH_SEQ")'
causes all re-prepared non-flush requests to be re-inserted into the
I/O scheduler.  With this change, an I/O scheduler may get the same
request inserted again, even several times, without finish_request
being invoked on the request before each re-insertion.

For the I/O scheduler, every such re-prepared request is equivalent
to the insertion of a new request. For schedulers like mq-deadline
or kyber this fact causes no problems. In contrast, it confuses a stateful
scheduler like BFQ, which preserves state for an I/O request until
finish_request is invoked on it. In particular, BFQ has no way
to know that the above re-insertions concern the same, already dispatched
request. So it may get stuck waiting forever for the completion of these
re-inserted requests, thus preventing any other queue of
requests from being served.

We are trying to address this issue by adding a requeue_request hook
to the bfq interface.

Unfortunately, with our current implementation of requeue_request in
place, bfq eventually reaches an incoherent state.  This is apparently
caused by a requeue of an I/O request immediately followed by a
completion of the same request.  This sequence seems rather absurd, and
drives bfq crazy.  But we don't have definitive results on this yet.

We're working on it, sorry again for the delay.

Thanks,
Paolo

> -- 
> Ming
