Message-ID: <cf7f8f88-7d3e-8818-8584-e2276e7a1f30@huawei.com>
Date: Tue, 25 Oct 2022 10:08:10 +0100
From: John Garry <john.garry@...wei.com>
To: Ming Lei <ming.lei@...hat.com>
CC: <axboe@...nel.dk>, <linux-kernel@...r.kernel.org>,
<linux-block@...r.kernel.org>, <hch@....de>,
Bart Van Assche <bvanassche@....org>
Subject: Re: [PATCH] blk-mq: Properly init bios from
blk_mq_alloc_request_hctx()
On 25/10/2022 10:00, Ming Lei wrote:
>> My use case is in SCSI EH domain. For my HBA controller of interest, to
>> abort an erroneous IO we must send a controller proprietary abort
>> command on same HW queue as original command. So we would need to
>> allocate this abort request for a specific HW queue.
> IMO, it is one bad hw/sw interface.
>
> First such request has to be reserved, since all inflight IOs can be in error.
Right
>
> Second error handling needs to provide forward-progress, and it is supposed
> to not require external dependency, otherwise easy to cause deadlock, but
> here request from specific HW queue just depends on this queue's cpumask.
>
> Also if it has to be reserved, it can be done as one device/driver private
> command, so why bother blk-mq for this special use case?
I have a series for reserved request support, which I will send later.
Please have a look. And as I mentioned, I would probably not end up using
blk_mq_alloc_request_hctx() anyway.
>
>> I mentioned before that if no hctx->cpumask is online then we don't need
>> to allocate a request. That is because if no hctx->cpumask is online,
>> this means that original erroneous IO must be completed due to nature of
>> how blk-mq cpu hotplug handler works, i.e. drained, and then we don't
>> actually need to abort it any longer, so ok to not get a request.
> No, it is really not OK, if all cpus in hctx->cpumask are offline, you
> can't allocate request on the specified hw queue, then the erroneous IO
> can't be handled, then cpu hotplug handler may hang for ever.
If the erroneous IO is still in-flight from the blk-mq perspective, then how
can all of hctx->cpumask be offline? I thought that we guarantee that
hctx->cpumask cannot go fully offline until the queue is drained.
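
To make the invariant in question concrete, here is a toy userspace model
of a single hw queue. It is only a sketch and not kernel code: struct
toy_hctx and the toy_* helpers are hypothetical names made up for
illustration, and the real hotplug handling is more involved. It assumes
that taking the last online CPU of the mask offline first blocks new
allocations and then has to wait for in-flight requests to complete.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one hardware queue (hypothetical, for illustration only). */
struct toy_hctx {
	unsigned int online_cpus;	/* online CPUs in the queue's mask */
	unsigned int inflight;		/* requests issued but not completed */
	bool inactive;			/* set when the last CPU starts to go offline */
};

/* Allocation on this queue fails once it is inactive or has no online CPU. */
static bool toy_alloc_request(struct toy_hctx *h)
{
	if (h->inactive || h->online_cpus == 0)
		return false;
	h->inflight++;
	return true;
}

/* Complete one in-flight request. */
static void toy_complete_request(struct toy_hctx *h)
{
	assert(h->inflight > 0);
	h->inflight--;
}

/*
 * Try to take one CPU of the mask offline.
 * Returns false when the caller has to keep waiting: the last CPU may
 * only go away after every in-flight request has completed.
 */
static bool toy_cpu_offline(struct toy_hctx *h)
{
	assert(h->online_cpus > 0);
	if (h->online_cpus > 1) {
		h->online_cpus--;
		return true;
	}
	h->inactive = true;		/* stop new allocations first */
	if (h->inflight > 0)
		return false;		/* not drained yet, must wait */
	h->online_cpus = 0;
	return true;
}
```

In this model, a queue with one online CPU and one in-flight request
refuses to go offline until toy_complete_request() has run. It also shows
the other side of the argument: once the queue is marked inactive,
toy_alloc_request() fails, so an abort that needs a new request from this
queue could not be allocated while the offline step is waiting.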
Thanks,
John