Message-ID: <20180905212659.GB21352@ming.t460p>
Date:   Thu, 6 Sep 2018 05:27:00 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Jianchao Wang <jianchao.w.wang@...cle.com>
Cc:     axboe@...nel.dk, bart.vanassche@....com, sagi@...mberg.me,
        keith.busch@...el.com, jthumshirn@...e.de, jsmart2021@...il.com,
        linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org,
        linux-block@...r.kernel.org
Subject: Re: [PATCH 0/3] Introduce a light-weight queue close feature

On Wed, Sep 05, 2018 at 12:09:43PM +0800, Jianchao Wang wrote:
> Dear all
> 
> As we know, queue freeze is used to stop new IO coming in and drain
> the request queue. Draining the queue is necessary here, because
> queue freeze kills the percpu-ref q_usage_counter and needs to drain
> the q_usage_counter before switching it back to percpu mode. This can
> be a problem when we just want to prevent new IO.
> 
> In nvme-pci, nvme_dev_disable freezes queues to prevent new IO.
> nvme_reset_work will unfreeze and wait to drain the queues. However,
> if an IO times out at that moment, nobody can do recovery, as
> nvme_reset_work is waiting. We will encounter an IO hang.

When we discussed this nvme timeout issue before, I pointed out that
it is caused by a limitation of blk_mq_unfreeze_queue(): unfreeze can
only be done once the queue's ref counter has dropped to zero.

For this nvme timeout case, we may relax the limitation, for example
by introducing another API, blk_freeze_queue_stop(), as the counterpart
of blk_freeze_queue_start(), and simply switching the percpu-ref from
atomic mode back to percpu mode inside the new API.
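
To make the idea above concrete, here is a rough userspace sketch. It
is only a single-threaded toy model, not kernel code: struct toy_queue
and all toy_* functions are made-up names for illustration, and the
"dead" flag stands in for the percpu-ref being killed and switched to
atomic mode. It contrasts an unfreeze that insists on a drained counter
with a hypothetical blk_freeze_queue_stop() style helper that does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy, single-threaded stand-in for a queue and its q_usage_counter. */
struct toy_queue {
	long usage;       /* references held by in-flight IO */
	bool dead;        /* set by freeze start: new IO is refused */
	int freeze_depth; /* nested freeze count */
};

/* Models entering the queue: succeeds only while the ref is live. */
static bool toy_queue_enter(struct toy_queue *q)
{
	if (q->dead)
		return false;
	q->usage++;
	return true;
}

/* Models an in-flight IO completing and dropping its reference. */
static void toy_queue_exit(struct toy_queue *q)
{
	assert(q->usage > 0);
	q->usage--;
}

/* Models blk_freeze_queue_start(): the first freeze kills the ref. */
static void toy_freeze_start(struct toy_queue *q)
{
	if (q->freeze_depth++ == 0)
		q->dead = true;
}

/*
 * Models the current limitation: unfreeze is refused (returns false)
 * unless the usage counter has already drained to zero.
 */
static bool toy_unfreeze(struct toy_queue *q)
{
	assert(q->freeze_depth > 0);
	if (q->usage != 0)
		return false;
	if (--q->freeze_depth == 0)
		q->dead = false;
	return true;
}

/*
 * Hypothetical counterpart of freeze start: revive the ref without
 * waiting for in-flight IO to drain.
 */
static void toy_freeze_stop(struct toy_queue *q)
{
	assert(q->freeze_depth > 0);
	if (--q->freeze_depth == 0)
		q->dead = false;
}
```

With one IO still in flight after toy_freeze_start(), toy_unfreeze()
is refused, while toy_freeze_stop() lets new IO enter again right away.
That is the relaxed behaviour the timeout path would need.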

> 
> So introduce a light-weight queue close feature in this patch set
> which could prevent new IO and needn't drain the queue.

Frankly speaking, IMO, it may not be a good idea to mess up the fast path
just to handle the extremely unusual timeout event. The same is true
for the preempt-only stuff; as you saw, I have posted a patchset for
killing it.

Thanks,
Ming
