linux-kernel - Re: v4.15 and I/O hang with BFQ


Message-Id: <6BCE8A16-7D00-4B55-8E2D-0DDC7C187C14@linaro.org>
Date:   Mon, 5 Feb 2018 20:07:03 +0100
From:   Paolo Valente <paolo.valente@...aro.org>
To:     bfq-iosched@...glegroups.com
Cc:     Ming Lei <ming.lei@...hat.com>,
        Oleksandr Natalenko <oleksandr@...alenko.name>,
        Ivan Kozik <ivan@...ios.org>,
        linux-block <linux-block@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Jens Axboe <axboe@...nel.dk>,
        Linus Walleij <linus.walleij@...aro.org>,
        SERENA ZIVIANI <169364@...denti.unimore.it>
Subject: Re: v4.15 and I/O hang with BFQ



> On 30 Jan 2018, at 16:40, Paolo Valente <paolo.valente@...aro.org> wrote:
> 
> 
> 
>> On 30 Jan 2018, at 15:40, Ming Lei <ming.lei@...hat.com> wrote:
>> 
>> On Tue, Jan 30, 2018 at 03:30:28PM +0100, Oleksandr Natalenko wrote:
>>> Hi.
>>> 
>> ...
>>>  systemd-udevd-271   [000] ....     4.311033: bfq_insert_requests: insert rq->0
>>>  systemd-udevd-271   [000] ...1     4.311037: blk_mq_do_dispatch_sched: not get rq, 1
>>>         cfdisk-408   [000] ....    13.484220: bfq_insert_requests: insert rq->1
>>>   kworker/0:1H-174   [000] ....    13.484253: blk_mq_do_dispatch_sched: not get rq, 1
>>> ===
>>> 
>>> Looks the same, right?
>> 
>> Yeah, same as before.
>> 
> 
> Hi guys,
> sorry for the delay with this fix.  We are proceeding very slowly on
> this, because I'm super busy.  Anyway, now I can at least explain in
> more detail the cause that leads to this hang.  Commit 'a6a252e64914
> ("blk-mq-sched: decide how to handle flush rq via RQF_FLUSH_SEQ")'
> causes all non-flush re-prepared requests to be re-inserted into the I/O
> scheduler.  With this change, I/O schedulers may get the same request
> inserted again, even several times, without finish_request being invoked
> on the request before each re-insertion.
> 
> For the I/O scheduler, every such re-prepared request is equivalent
> to the insertion of a new request. For schedulers like mq-deadline
> or kyber, this causes no problems. In contrast, it confuses a stateful
> scheduler like BFQ, which preserves state for an I/O request until
> finish_request is invoked on it. In particular, BFQ has no way
> to know that the above re-insertions concern the same, already dispatched
> request. So it may get stuck waiting forever for the completion of these
> re-inserted requests, thus preventing any other queue of
> requests from being served.
> 
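To make the failure mode above concrete, here is a minimal toy model in
Python. It is not BFQ code: the class ToyStatefulScheduler, its in_flight
counter, and the one-request-at-a-time dispatch rule are all assumptions
chosen only to show how a scheduler that keeps per-request state until
finish_request can stall when the same request is inserted twice.

```python
class ToyStatefulScheduler:
    """Hypothetical stand-in for a stateful I/O scheduler (not BFQ itself)."""

    def __init__(self):
        self.pending = []   # requests inserted but not yet dispatched
        self.in_flight = 0  # dispatched requests still awaiting finish_request

    def insert_request(self, rq):
        # Every insertion looks like a brand-new request to the scheduler.
        self.pending.append(rq)

    def dispatch_request(self):
        # Assumed policy: dispatch nothing while an earlier request is
        # still believed to be in flight.
        if self.in_flight > 0 or not self.pending:
            return None
        self.in_flight += 1
        return self.pending.pop(0)

    def finish_request(self, rq):
        self.in_flight -= 1


def simulate_reinsertion_hang():
    sched = ToyStatefulScheduler()
    sched.insert_request("A")
    first = sched.dispatch_request()   # "A" goes to the driver
    # The driver cannot take it; the request is re-prepared and
    # re-inserted WITHOUT finish_request being called first.
    sched.insert_request(first)
    second = sched.dispatch_request()  # scheduler still waits on "A"
    return first, second, sched.in_flight
```

In this sketch the second dispatch attempt yields nothing, because the
scheduler is waiting for a completion that can only happen if it
dispatches the very request it is holding back. That is loosely analogous
to the "not get rq" lines in the trace, under the assumptions stated above.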
> We are trying to address this issue by adding the hook requeue_request
> to the bfq interface.
> 
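As a rough illustration of what a requeue hook could buy, the toy model
below gives the scheduler a chance to drop its state for a request before
that request is inserted again. Everything here is hypothetical
(ToySchedulerWithRequeue, the in_flight counter, and the body of
requeue_request); it only sketches the idea and says nothing about how
the actual patch works.

```python
class ToySchedulerWithRequeue:
    """Hypothetical toy scheduler with a requeue hook (not BFQ itself)."""

    def __init__(self):
        self.pending = []   # requests inserted but not yet dispatched
        self.in_flight = 0  # dispatched requests still awaiting finish_request

    def insert_request(self, rq):
        self.pending.append(rq)

    def dispatch_request(self):
        # Assumed policy: one outstanding request at a time.
        if self.in_flight > 0 or not self.pending:
            return None
        self.in_flight += 1
        return self.pending.pop(0)

    def finish_request(self, rq):
        self.in_flight -= 1

    def requeue_request(self, rq):
        # Assumed hook body: forget the earlier dispatch so that the
        # coming re-insertion is not counted as a second request.
        self.in_flight -= 1


def simulate_requeue_then_reinsert():
    sched = ToySchedulerWithRequeue()
    sched.insert_request("A")
    first = sched.dispatch_request()
    sched.requeue_request(first)       # scheduler is told about the requeue
    sched.insert_request(first)        # then the request comes back
    second = sched.dispatch_request()  # can be dispatched again
    sched.finish_request(second)       # and completes normally
    return first, second, sched.in_flight
```

With the hook in place, the toy scheduler dispatches the request a second
time and its counter returns to zero after completion.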
> Unfortunately, with our current implementation of requeue_request in
> place, bfq eventually gets to an incoherent state.  This is apparently
> caused by a requeue of an I/O request, immediately followed by a
> completion of the same request.  This seems rather absurd, and drives
> bfq crazy.  But this is something for which we don't have definite
> results yet.
> 
> We're working on it, sorry again for the delay.
> 

Ok, patch arriving ... Please test it.

Thanks,
Paolo

> Thanks,
> Paolo
> 
>> -- 
>> Ming
> 