Message-ID: <4489f80f-9f39-8f3f-5d10-6b113131e65c@opensource.wdc.com>
Date:   Tue, 20 Dec 2022 08:00:43 +0900
From:   Damien Le Moal <damien.lemoal@...nsource.wdc.com>
To:     John Garry <john.g.garry@...cle.com>,
        Jason Yan <yanaijie@...wei.com>,
        Xingui Yang <yangxingui@...wei.com>, jejb@...ux.ibm.com,
        martin.petersen@...cle.com, linux-ide@...r.kernel.org,
        hare@...e.com, hch@....de
Cc:     linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
        linuxarm@...wei.com, prime.zeng@...ilicon.com,
        kangfenglong@...wei.com
Subject: Re: [PATCH V2] scsi: libsas: Directly kick-off EH when ATA device
 fell off

On 12/20/22 00:55, John Garry wrote:
> On 19/12/2022 15:28, Jason Yan wrote:
>>>> +    if (test_bit(SAS_DEV_GONE, &dev->state) && dev_is_sata(dev))
>>>> +        sas_ata_device_link_abort(dev, false);
>>>
>>> Firstly, I think that there is a bug in sas_ata_device_link_abort() -> 
>>> ata_link_abort() code in that the host lock is not grabbed, as the 
>>> comment in ata_port_abort() mentions. Having said that, libsas already 
>>> had some dodgy host locking usage - specifically dropping the lock 
>>> for the queuing path (that's something else to be fixed up ... I think 
>>
>> Taking big locks in queuing path is not a good idea. This will bring 
>> down performance.
> 
> But ata_qc_issue() is expected to be called with the host lock 
> held (and to keep it held).
> 
> I think that the reason libsas drops the lock is because some LLDD 
> queuecommand CBs call task_done() in some error paths. If we kept the 
> lock held, then we could have a deadlock, for example:
> 
> sas_ata_qc_issue (has lock) -> lldd_execute_task() = 
> pm8001_queue_command() -> task_done() = sas_ata_task_done() -> grab host 
> lock => deadlock.

That should be easily solvable by using a workqueue to do task_done(), no?
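
To make the idea concrete, here is a rough userspace sketch of the pattern, not kernel code and not a proposed patch. It uses pthreads in place of the kernel workqueue API, and every name in it (host_lock, queue_work_item(), worker(), issue_command(), run_demo(), and the simplified lldd_execute_task() and task_done()) is a hypothetical stand-in chosen for illustration. The issue path keeps the lock held across the LLDD call, and the error path queues the completion instead of calling it directly, so the completion takes the lock later from the worker context.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the ATA host lock: a plain, non-recursive mutex. */
static pthread_mutex_t host_lock = PTHREAD_MUTEX_INITIALIZER;

/* Minimal "workqueue": a list of work items drained by one worker thread. */
struct work {
    void (*fn)(struct work *);
    struct work *next;
};

static pthread_mutex_t wq_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t wq_cond = PTHREAD_COND_INITIALIZER;
static struct work *wq_head;
static bool wq_stop;

/* Hypothetical analogue of queue_work(): hand an item to the worker. */
static void queue_work_item(struct work *w)
{
    assert(w->fn != NULL);
    pthread_mutex_lock(&wq_lock);
    w->next = wq_head;
    wq_head = w;
    pthread_cond_signal(&wq_cond);
    pthread_mutex_unlock(&wq_lock);
}

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&wq_lock);
    for (;;) {
        while (!wq_head && !wq_stop)
            pthread_cond_wait(&wq_cond, &wq_lock);
        if (!wq_head)
            break;              /* stop requested and queue drained */
        struct work *w = wq_head;
        wq_head = w->next;
        pthread_mutex_unlock(&wq_lock);
        w->fn(w);               /* run the item without wq_lock held */
        pthread_mutex_lock(&wq_lock);
    }
    pthread_mutex_unlock(&wq_lock);
    return NULL;
}

static int completed;           /* protected by host_lock */

/* Simplified completion: like the real one, it needs the host lock. */
static void task_done(struct work *w)
{
    (void)w;
    pthread_mutex_lock(&host_lock);
    completed++;
    pthread_mutex_unlock(&host_lock);
}

static struct work done_work = { .fn = task_done };

/* Simplified LLDD error path: defer the completion instead of calling it. */
static int lldd_execute_task(void)
{
    queue_work_item(&done_work);
    return -1;
}

/* Simplified issue path: the host lock stays held across the LLDD call. */
static int issue_command(void)
{
    pthread_mutex_lock(&host_lock);
    int ret = lldd_execute_task();
    pthread_mutex_unlock(&host_lock);
    return ret;
}

/* Runs one failing command and returns the number of completions seen. */
int run_demo(void)
{
    pthread_t t;

    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    issue_command();

    pthread_mutex_lock(&wq_lock);
    wq_stop = true;
    pthread_cond_signal(&wq_cond);
    pthread_mutex_unlock(&wq_lock);
    pthread_join(t, NULL);
    return completed;
}
```

Calling run_demo() from a main() and building with -pthread should complete the command exactly once. If lldd_execute_task() called task_done() directly instead of queuing it, the same thread would try to take host_lock twice, which is the deadlock John describes above.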

> 
> Thanks,
> John

-- 
Damien Le Moal
Western Digital Research
