lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <234e04db-7539-07e4-a6b8-c6b05f78193d@opensource.wdc.com>
Date:   Tue, 20 Dec 2022 07:59:06 +0900
From:   Damien Le Moal <damien.lemoal@...nsource.wdc.com>
To:     Jason Yan <yanaijie@...wei.com>,
        John Garry <john.g.garry@...cle.com>,
        Xingui Yang <yangxingui@...wei.com>, jejb@...ux.ibm.com,
        martin.petersen@...cle.com, linux-ide@...r.kernel.org,
        hare@...e.com, hch@....de
Cc:     linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
        linuxarm@...wei.com, prime.zeng@...ilicon.com,
        kangfenglong@...wei.com
Subject: Re: [PATCH V2] scsi: libsas: Directly kick-off EH when ATA device
 fell off

On 12/20/22 00:28, Jason Yan wrote:
> On 2022/12/19 17:23, John Garry wrote:
>> On 16/12/2022 10:03, Xingui Yang wrote:
>>> If the ATA device fell off, call sas_ata_device_link_abort() directly and
>>> mark all outstanding QCs as failed and kick-off EH Immediately. This 
>>> avoids
>>> having to wait for block layer timeouts.
>>>
>>> Signed-off-by: Xingui Yang <yangxingui@...wei.com>
>>> ---
>>> Changes to v1:
>>> - Use dev_is_sata() to check ATA device type
>>>   drivers/scsi/libsas/sas_discover.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/scsi/libsas/sas_discover.c 
>>> b/drivers/scsi/libsas/sas_discover.c
>>> index d5bc1314c341..a12b65eb4a2a 100644
>>> --- a/drivers/scsi/libsas/sas_discover.c
>>> +++ b/drivers/scsi/libsas/sas_discover.c
>>> @@ -362,6 +362,9 @@ static void sas_destruct_ports(struct asd_sas_port 
>>> *port)
>>>   void sas_unregister_dev(struct asd_sas_port *port, struct 
>>> domain_device *dev)
>>>   {
>>> +    if (test_bit(SAS_DEV_GONE, &dev->state) && dev_is_sata(dev))
>>> +        sas_ata_device_link_abort(dev, false);
>>
>> Firstly, I think that there is a bug in sas_ata_device_link_abort() -> 
>> ata_link_abort() code in that the host lock in not grabbed, as the 
>> comment in ata_port_abort() mentions. Having said that, libsas had 
>> already some dodgy host locking usage - specifically dropping the lock 
>> for the queuing path (that's something else to be fixed up ... I think 
> 
> Taking big locks in queuing path is not a good idea. This will bring 
> down performance.

With HDDs ? You will not see any difference (and SATA SSDs are not a thing
anymore, enough that we should worry too much. NVMe took over). And that
"big lock" is libata is really an integral part of the design. To remove
it, you will need to rewrite libata entirely...

> 
> 
>> that is due to queue command CB calling task_done() in some cases), but 
>> I still think that sas_ata_device_link_abort() should be fixed (to grab 
>> the host lock).
> 
> For sas_ata_device_link_abort(), it should grab ap->lock.

Which is what libata code comments (mistakenly in many places) always
refer as host lock.

> 
> Thanks,
> Jason

-- 
Damien Le Moal
Western Digital Research

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ