linux-kernel - Re: [PATCH V2] scsi: libsas: Directly kick-off EH when ATA device fell off


Message-ID: <12cf34d6-8267-ac81-02c4-190bb9afc50b@oracle.com>
Date:   Tue, 20 Dec 2022 08:43:31 +0000
From:   John Garry <john.g.garry@...cle.com>
To:     Damien Le Moal <damien.lemoal@...nsource.wdc.com>,
        Jason Yan <yanaijie@...wei.com>,
        Xingui Yang <yangxingui@...wei.com>, jejb@...ux.ibm.com,
        martin.petersen@...cle.com, linux-ide@...r.kernel.org,
        hare@...e.com, hch@....de
Cc:     linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
        linuxarm@...wei.com, prime.zeng@...ilicon.com,
        kangfenglong@...wei.com
Subject: Re: [PATCH V2] scsi: libsas: Directly kick-off EH when ATA device
 fell off

On 19/12/2022 23:00, Damien Le Moal wrote:
>> But it is expected that ata_qc_issue() is called with the host lock
>> held (and that the lock stays held).
>>
>> I think that the reason libsas drops the lock is because some LLDD
>> queuecommand CBs call task_done() in some error paths. If we kept the
>> lock held, then we could have a deadlock, for example:
>>
>> sas_ata_qc_issue (has lock) -> lldd_execute_task() =
>> pm8001_queue_command() -> task_done() = sas_ata_task_done() -> grab host
>> lock => deadlock.
> That should be easily solvable using a workqueue for doing task_done(), no?
> 

I don't see why we cannot always just return an error code directly from
the lldd_execute_task CB; we would then end up calling scsi_done()
directly. But I am suspicious as to why it is not already done this way.

Looking at the code history, this fiddling with the ap->lock actually 
looks related to commit 312d3e56119a4bc5c36a96818f87f650c069ddc2 
("[SCSI] libsas: remove ata_port.lock management duties from lldds"). I 
will check that further.
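
To make the two options concrete, here is a minimal user-space sketch of the
call chain quoted above. It is a toy model and not the libsas code: every
model_* name and MODEL_ERR is hypothetical, and the "lock" is a plain flag
that reports a failed acquisition instead of spinning, so the self-deadlock
shows up as a counter rather than a hang.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_ERR (-1)  /* hypothetical error code, stands in for a real errno */

/* Toy stand-in for ap->lock: non-recursive, records whether it is held. */
struct model_lock { bool held; };

struct model_port {
	struct model_lock lock;
	int completed;   /* completions that ran normally */
	int deadlocks;   /* attempts to take the lock while already holding it */
};

static bool model_trylock(struct model_lock *l)
{
	if (l->held)
		return false;   /* a real spinlock would spin forever here */
	l->held = true;
	return true;
}

static void model_unlock(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Models the completion path (task_done), which takes the port lock. */
static void model_task_done(struct model_port *ap)
{
	if (!model_trylock(&ap->lock)) {
		ap->deadlocks++;
		return;
	}
	ap->completed++;
	model_unlock(&ap->lock);
}

/* LLDD variant A: the error path completes the task inline. */
static int model_execute_task_inline(struct model_port *ap)
{
	model_task_done(ap);
	return 0;
}

/* LLDD variant B: the error path only reports failure to the caller. */
static int model_execute_task_errcode(struct model_port *ap)
{
	(void)ap;
	return MODEL_ERR;
}

/* Models an issue path that keeps the lock held across the LLDD callback. */
static int model_qc_issue(struct model_port *ap,
			  int (*execute)(struct model_port *))
{
	int ret;

	if (!model_trylock(&ap->lock))
		return MODEL_ERR;
	ret = execute(ap);
	model_unlock(&ap->lock);
	return ret;
}
```

With variant A, model_qc_issue() ends with deadlocks == 1 and completed == 0,
which is the "grab host lock => deadlock" step. With variant B the callback
never touches the lock, and the caller simply sees MODEL_ERR and can complete
the command itself once it is done with the lock.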

Thanks,
John