linux-kernel - Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1516296056.2676.23.camel@wdc.com>
Date:   Thu, 18 Jan 2018 17:20:57 +0000
From:   Bart Van Assche <Bart.VanAssche@....com>
To:     "snitzer@...hat.com" <snitzer@...hat.com>
CC:     "dm-devel@...hat.com" <dm-devel@...hat.com>,
        "hch@...radead.org" <hch@...radead.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "linux-block@...r.kernel.org" <linux-block@...r.kernel.org>,
        "osandov@...com" <osandov@...com>,
        "ming.lei@...hat.com" <ming.lei@...hat.com>,
        "axboe@...nel.dk" <axboe@...nel.dk>
Subject: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle

On Thu, 2018-01-18 at 12:03 -0500, Mike Snitzer wrote:
> On Thu, Jan 18 2018 at 11:50am -0500,
> Bart Van Assche <bart.vanassche@....com> wrote:
> > My comments about the above are as follows:
> > - It can take up to q->rq_timeout jiffies after a .queue_rq()
> >   implementation returned BLK_STS_RESOURCE before blk_mq_timeout_work()
> >   gets called. However, it can happen that only a few milliseconds after
> >   .queue_rq() returned BLK_STS_RESOURCE that the condition that caused
> >   it to return BLK_STS_RESOURCE gets cleared. So the above approach can
> >   result in long delays during which it will seem like the queue got
> >   stuck. Additionally, I think that the block driver should decide how
> >   long it takes before a queue is rerun and not the block layer core.
> 
> So configure q->rq_timeout to be shorter?  Which is configurable though
> blk_mq_tag_set's 'timeout' member.  It apparently defaults to 30 * HZ.
> 
> That is the problem with timeouts, there is generally no one size fits
> all.

Sorry but I think that would be wrong. The delay after which a queue is rerun
should not be coupled to the request timeout. These two should be independent.

> > - The lockup that I reported only occurs with the dm driver but not any
> >   other block driver. So why to modify the block layer core since this
> >   can be fixed by modifying the dm driver?
> 
> Hard to know it is only DM's blk-mq that is impacted.  That is the only
> blk-mq driver that you're testing like this (that is also able to handle
> faults, etc).

That's not correct. I'm also testing the SCSI core, which is one of the most
complicated block drivers.

> > - A much simpler fix and a fix that is known to work exists, namely
> >   inserting a blk_mq_delay_run_hw_queue() call in the dm driver.
> 
> Because your "much simpler" fix actively hurts performance, as is
> detailed in this header:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-4.16&id=ec3eaf9a673106f66606896aed6ddd20180b02ec

We are close to the start of the merge window so I think it's better to fall
back to an old approach that is known to work than to keep a new approach
that is known not to work. Additionally, the performance issue you referred
to only affects IOPS and bandwidth more than 1% with the lpfc driver and that
is because the queue depth it supports is much lower than for other SCSI HBAs,
namely 3 instead of 64.

Thanks,

Bart.