lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 21 Dec 2022 17:31:59 +0900
From:   Damien Le Moal <damien.lemoal@...nsource.wdc.com>
To:     Jason Yan <yanaijie@...wei.com>,
        John Garry <john.g.garry@...cle.com>,
        Xingui Yang <yangxingui@...wei.com>, jejb@...ux.ibm.com,
        martin.petersen@...cle.com, niklas.cassel@....com
Cc:     linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
        linuxarm@...wei.com, prime.zeng@...ilicon.com,
        kangfenglong@...wei.com
Subject: Re: [PATCH] scsi: libsas: Grab the host lock in
 sas_ata_device_link_abort()

On 12/21/22 15:35, Jason Yan wrote:
> On 2022/12/21 11:59, Damien Le Moal wrote:
>> On 2022/12/21 11:42, Jason Yan wrote:
>>> On 2022/12/21 8:36, Damien Le Moal wrote:
>>>> On 2022/12/20 23:59, John Garry wrote:
>>>>> On 20/12/2022 12:53, Xingui Yang wrote:
>>>>>> Grab the host lock in sas_ata_device_link_abort() before calling
>>>>>
>>>>> This is should be the ata port lock, right? I know that the ata
>>>>> comments
>>>>> say differently.
>>>>>
>>>>>> ata_link_abort(), as the comment in ata_link_abort() mentions.
>>>>>>
>>>>>
>>>>> Can you please add a fixes tag?
>>>>>
>>>>>> Signed-off-by: Xingui Yang <yangxingui@...wei.com>
>>>>>
>>>>> Reviewed-by: John Garry <john.g.garry@...cle.com>
>>>>>
>>>>>> ---
>>>>>>     drivers/scsi/libsas/sas_ata.c | 3 +++
>>>>>>     1 file changed, 3 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/scsi/libsas/sas_ata.c
>>>>>> b/drivers/scsi/libsas/sas_ata.c
>>>>>> index f7439bf9cdc6..4f2017b21e6d 100644
>>>>>> --- a/drivers/scsi/libsas/sas_ata.c
>>>>>> +++ b/drivers/scsi/libsas/sas_ata.c
>>>>>> @@ -889,7 +889,9 @@ void sas_ata_device_link_abort(struct
>>>>>> domain_device *device, bool force_reset)
>>>>>>     {
>>>>>>         struct ata_port *ap = device->sata_dev.ap;
>>>>>>         struct ata_link *link = &ap->link;
>>>>>> +    unsigned long flags;
>>>>>>     +    spin_lock_irqsave(ap->lock, flags);
>>>>>>         device->sata_dev.fis[2] = ATA_ERR | ATA_DRDY; /* tf status */
>>>>>>         device->sata_dev.fis[3] = ATA_ABORTED; /* tf error */
>>>>>>     @@ -897,6 +899,7 @@ void sas_ata_device_link_abort(struct
>>>>>> domain_device *device, bool force_reset)
>>>>>>         if (force_reset)
>>>>>>             link->eh_info.action |= ATA_EH_RESET;
>>>>>>         ata_link_abort(link);
>>>>
>>>> Really need to add lockdep annotations in libata to avoid/catch such
>>>> bugs...
>>>> Will work on that.
>>>
>>> Actually in libata there are many places calling ata_link_abort() not
>>> holding port lock. And some places are holding the real host
>>> lock(ata_host->lock) while calling ata_link_abort(). So if you add the
>>> lockdep annotations, there may be too many warnings. If these are real
>>> issues, we should fix them first.
>>
>> libata-EH does most of its work without the port lock held because by
>> the time
>> we get EH started, we are guaranteed to be idle with no commands in
>> flight. That
>> is why the calls you mention look like "bugs" but are not.
> 
> What about the interrupt handler such as ahci_error_intr()? I didn't see
> the callers hold the port lock too. Do they need the port lock?

It looks like it is missing for ahci_thunderx_irq_handler() but that one
takes the host lock. Same for xgene_ahci_irq_intr(), again no port lock
but host lock taken. And again for ahci_single_level_irq_intr() for the
non MSI case. For modern MSI adapters, the port lock is taken in

For other cases, ahci_multi_irqs_intr_hard) takes the port lock.

So it looks like ahci_port_intr() needs to take the lock and some
cleanups overall (the host lock should not be necessary in the command
path. But nobody seems to have issues with the "bad" cases... Probably
because they are not mainstream adapters.

Definitely some work needed here.

-- 
Damien Le Moal
Western Digital Research

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ