linux-kernel - Re: [PATCH] blk-mq: Properly init bios from blk_mq_alloc_request

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <cf7f8f88-7d3e-8818-8584-e2276e7a1f30@huawei.com>
Date:   Tue, 25 Oct 2022 10:08:10 +0100
From:   John Garry <john.garry@...wei.com>
To:     Ming Lei <ming.lei@...hat.com>
CC:     <axboe@...nel.dk>, <linux-kernel@...r.kernel.org>,
        <linux-block@...r.kernel.org>, <hch@....de>,
        Bart Van Assche <bvanassche@....org>
Subject: Re: [PATCH] blk-mq: Properly init bios from
 blk_mq_alloc_request_hctx()

On 25/10/2022 10:00, Ming Lei wrote:
>> My use case is in SCSI EH domain. For my HBA controller of interest, to
>> abort an erroneous IO we must send a controller proprietary abort
>> command on same HW queue as original command. So we would need to
>> allocate this abort request for a specific HW queue.
> IMO, it is one bad hw/sw interface.
> 
> First such request has to be reserved, since all inflight IOs can be in error.

Right

> 
> Second error handling needs to provide forward-progress, and it is supposed
> to not require external dependency, otherwise easy to cause deadlock, but
> here request from specific HW queue just depends on this queue's cpumask.
> 
> Also if it has to be reserved, it can be done as one device/driver private
> command, so why bother blk-mq for this special use case?

I have a series for reserved request support, which I will send later. 
Please have a look. And as I mentioned, I would prob not end up using 
blk_mq_alloc_request_hctx() anyway.

> 
>> I mentioned before that if no hctx->cpumask is online then we don't need
>> to allocate a request. That is because if no hctx->cpumask is online,
>> this means that original erroneous IO must be completed due to nature of
>> how blk-mq cpu hotplug handler works, i.e. drained, and then we don't
>> actually need to abort it any longer, so ok to not get a request.
> No, it is really not OK, if all cpus in hctx->cpumask are offline, you
> can't allocate
> request on the specified hw queue, then the erroneous IO can't be handled,
> then cpu hotplug handler may hang for ever.

If the erroneous IO is still in-flight from blk-mq perspective, then how 
can hctx->cpumask still be offline? I thought that we guarantee that 
hctx->cpumask cannot go offline until drained.

Thanks,
John