lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Tue, 6 Sep 2022 08:23:54 +0900
From:   Damien Le Moal <damien.lemoal@...nsource.wdc.com>
To:     John Garry <john.garry@...wei.com>, jejb@...ux.ibm.com,
        martin.petersen@...cle.com, jinpu.wang@...ud.ionos.com,
        yangxingui@...wei.com
Cc:     linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
        linuxarm@...wei.com, hare@...e.de
Subject: Re: [PATCH v2 2/6] scsi: libsas: Add sas_ata_device_link_abort()

On 9/3/22 01:19, John Garry wrote:
> Hi Damien,
> 
>>>>
>>>> But the pm8001 manual and current driver indicate that the
>>>> OPC_INB_SATA_ABORT command should be sent after read log ext when
>>>> handling NCQ error, regardless of an autopsy. I send OPC_INB_SATA_ABORT
>>>> in ata_eh_reset() -> pm8001_I_T_nexus_reset() -> pm8001_send_abort_all()
>>> You lost me: ata_eh_recover() will call ata_eh_reset() only if the 
>>> ATA_EH_RESET
>>> action flag is set. So are you saying that even though it is not 
>>> needed, you
>>> still need to set ATA_EH_RESET for pm8001 ?
>>
>> As below, it was the only location I found suitable to call 
>> pm8001_send_abort_all().
>>
>> However I am not really sure it is required now. For pm8001 NCQ error 
>> handling we require 2x steps:
>> - read log ext
>> - Send OPC_INB_SATA_ABORT - we do this in pm8001_send_abort_all()
>>
>> pm8001_send_abort_all() sends OPC_INB_SATA_ABORT in "device abort all" 
>> mode, meaning any IO in the HBA is aborted for the device. But we are 
>> also earlier in EH sending OPC_INB_SATA_ABORT for individual IOs in 
>> sas_eh_handle_sas_errors() -> sas_scsi_find_task() -> 
>> pm8001_abort_task() -> sas_execute_internal_abort_single() -> ... 
>> send_abort_task()
>>
>> So I don't think that the pm8001_send_abort_all() call has any effect, 
>> as we're already aborting any outstanding IO earlier.
>>
>> Admittedly the order of the 2x steps is different, but 
>> OPC_INB_SATA_ABORT does not send any protocol message to the disk, so 
>> would not affect anything subsequently read with read log ext.
>>
>> Having said all that, it may be wise to still send 
>> pm8001_send_abort_all()..
> 
> Have you had a chance to consider all this yet?
> 
> I am now starting to think that it is not necessary to call 
> pm8001_send_abort_all(), and instead rely only on 
> sas_eh_handle_sas_errors() -> sas_scsi_find_task() -> 
> pm8001_abort_task() -> sas_execute_internal_abort_single() -> ... -> 
> send_abort_task() to abort each outstanding IO. Then we would not have 
> the issue of forcing the reset in $subject (to lead to calling 
> pm8001_send_abort_all()).
> 
> This idea could simply be tested by stubbing pm8001_send_abort_all()
> (and dropping the |= ATA_EH_RESET in $subject).

Send a re-spinned patch and I will test :)

> 
> Thanks,
> John
> 

-- 
Damien Le Moal
Western Digital Research

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ