Message-ID: <73a430da-84c8-5457-108a-7e1e2d81fa61@gmail.com>
Date:   Fri, 20 Aug 2021 08:27:48 -0700
From:   James Smart <jsmart2021@...il.com>
To:     Daniel Wagner <dwagner@...e.de>, linux-nvme@...ts.infradead.org
Cc:     linux-kernel@...r.kernel.org,
        James Smart <james.smart@...adcom.com>,
        Keith Busch <kbusch@...nel.org>,
        Ming Lei <ming.lei@...hat.com>,
        Sagi Grimberg <sagi@...mberg.me>,
        Hannes Reinecke <hare@...e.de>,
        Wen Xiong <wenxiong@...ibm.com>,
        Himanshu Madhani <himanshu.madhani@...cle.com>
Subject: Re: [PATCH v5 0/3] Handle update hardware queues and queue freeze
 more carefully

On 8/20/2021 4:55 AM, Daniel Wagner wrote:
> On Fri, Aug 20, 2021 at 10:48:32AM +0200, Daniel Wagner wrote:
>> Then we try to do the same thing again, which fails, so we never
>> make progress.
>>
>> So clearly we need to update the number of queues at some point. What
>> would be the right thing to do here? As I understand it, we need to be
>> careful with frozen requests. Can we abort them (is this even possible
>> in this state?) and requeue them before we update the queue numbers?
> 
> After staring a bit longer at the reset path, I think there is no
> pending request in any queue. nvme_fc_delete_association() calls
> __nvme_fc_abort_outstanding_ios(), which makes sure all queues are
> drained (usage counter is 0). It also clears the NVME_FC_Q_LIVE bit,
> which prevents further requests from being added to the queues.

yes, as long as we haven't attempted to create the io queues via 
nvme_fc_connect_io_queues(), nothing should successfully queue a request 
and run down the hctx to start the io. nvme_fc_connect_io_queues() will 
use the queue for the Connect cmd, which is probably what generated the 
prior -16389 error.

Which says the patch "nvme-fc: Update hardware queues before using them" 
should be good to use.
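
(For illustration only: a rough sketch of the ordering that patch
establishes in the recreate path. This is simplified and from memory of
fc.c, not the literal diff.)

	/* Resize the tag set before any io queue is used, so the
	 * Connect commands issued by nvme_fc_connect_io_queues()
	 * run against a correctly sized hctx map.
	 */
	blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);

	ret = nvme_fc_create_hw_io_queues(ctrl, ctrl->ctrl.sqsize + 1);
	if (ret)
		goto out_free_io_queues;
	ret = nvme_fc_connect_io_queues(ctrl, ctrl->ctrl.sqsize + 1);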

> 
> I start to wonder why we have to do the nvme_start_freeze() in the
> first place and why we want to wait for the freeze. 88e837ed0f1f
> ("nvme-fc: wait for queues to freeze before calling
> update_hr_hw_queues") doesn't really tell why we need to wait for the
> freeze.

I think that is probably going to be true as well - no need to 
freeze/unfreeze around this path. This was also a rather late add (last 
Oct), so we had been running without the freezes for a long time; 
granted, few devices change their queue counts.

I'll have to see if I can find what prompted the change. At first blush, 
I'm fine reverting it.
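
(As a reminder of the pattern in question, paraphrasing from memory what
88e837ed0f1f added around the hw queue update, not quoting it verbatim:)

	nvme_start_freeze(&ctrl->ctrl);   /* start draining q_usage_counter */
	nvme_wait_freeze(&ctrl->ctrl);    /* block until every queue drains */
	blk_mq_update_nr_hw_queues(&ctrl->tag_set, nr_io_queues);
	nvme_unfreeze(&ctrl->ctrl);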

> 
> Given we know the usage counter of the queues is 0, I think we are
> safe to move the blk_mq_update_nr_hw_queues() before the start queue
> code. Also note nvme_fc_create_hw_io_queues() calls
> blk_mq_freeze_queue() but it won't block as we are sure there is no
> pending request.

Agree.
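
(For reference, blk_mq_freeze_queue() only blocks while the usage
counter is nonzero; roughly, paraphrasing block/blk-mq.c:)

	void blk_mq_freeze_queue(struct request_queue *q)
	{
		blk_freeze_queue_start(q);    /* kill the percpu usage ref */
		blk_mq_freeze_queue_wait(q);  /* wait for q_usage_counter == 0;
					       * returns immediately when, as
					       * here, nothing is in flight */
	}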

-- james

