linux-kernel - Re: [PATCH v3 1/3] scsi: hisi_sas: Enable force phy when SATA disk directly connected

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <d233a108-a46e-47dd-86ad-756c60c8665e@oracle.com>
Date: Mon, 24 Feb 2025 12:21:02 +0000
From: John Garry <john.g.garry@...cle.com>
To: yangxingui <yangxingui@...wei.com>, liyihang9@...wei.com,
        yanaijie@...wei.com
Cc: jejb@...ux.ibm.com, martin.petersen@...cle.com, linux-scsi@...r.kernel.org,
        linux-kernel@...r.kernel.org, linuxarm@...wei.com,
        prime.zeng@...wei.com, liuyonglong@...wei.com, kangfenglong@...wei.com,
        liyangyang20@...wei.com, f.fangjian@...wei.com,
        xiabing14@...artners.com
Subject: Re: [PATCH v3 1/3] scsi: hisi_sas: Enable force phy when SATA disk
 directly connected

On 24/02/2025 09:36, yangxingui wrote:
>>
>>
>> So do you mean that all IO to this disk will error? If yes, then this 
>> is good.
> Yes, IO error or IO result does not meet expectations. As shown in the 
> log below, due to an abnormal port ID, the SNs of the two disks read are 
> the same.

Do you mean that this is mainline kernel behaviour, below:

> 
> [448000.504979] hisi_sas_v3_hw 0000:d4:02.0: phyup: phy1 link_rate=10(sata)
> [448000.505070] sas: phy-10:1 added to port-10:1, phy_mask:0x2 
> (5000000000000a01)
> [448000.505247] sas: DOING DISCOVERY on port 1, pid:2239187
> [448000.505255] hisi_sas_v3_hw 0000:d4:02.0: dev[2:5] found
> [448000.505274] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
> [448000.505295] sas: ata31: end_device-10:0: dev error handler
> [448000.505299] sas: ata32: end_device-10:1: dev error handler
> [448001.300517] hisi_sas_v3_hw 0000:d4:02.0: phydown: phy1 phy_state=0x1 
>   // phy1's hw port id released
> [448001.300522] hisi_sas_v3_hw 0000:d4:02.0: ignore flutter phy1 down
> [448001.436187] hisi_sas_v3_hw 0000:d4:02.0: phyup: phy2 
> link_rate=10(sata) // phy2 occupies the hardware port ID of phy1
> [448001.608766] hisi_sas_v3_hw 0000:d4:02.0: phyup: phy1 
> link_rate=10(sata) // phy1 was assigned a new hardware port ID
> [448001.775605] ata32.00: ATA-11: WUH721816ALE6L4, PCGAW660, max UDMA/133
> [448002.159364] sas: phy-10:2 added to port-10:2, phy_mask:0x4 
> (5000000000000a02)
> [448002.159575] sas: DOING DISCOVERY on port 2, pid:2239187
> [448002.159581] hisi_sas_v3_hw 0000:d4:02.0: dev[3:5] found
> [448002.159602] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
> [448002.159623] sas: ata31: end_device-10:0: dev error handler
> [448002.159633] sas: ata32: end_device-10:1: dev error handler
> [448002.159636] sas: ata33: end_device-10:2: dev error handler
> [448002.393349] hisi_sas_v3_hw 0000:d4:02.0: phydown: phy2 phy_state=0x3
> [448002.393354] hisi_sas_v3_hw 0000:d4:02.0: ignore flutter phy2 down
> [448002.684937] hisi_sas_v3_hw 0000:d4:02.0: phyup: phy2 link_rate=10(sata)
> [448002.851639] ata33.00: ATA-11: WUH721816ALE6L4, PCGAW660, max UDMA/133
> [448002.851644] ata33.00: 31251759104 sectors, multi 0: LBA48 NCQ (depth 
> 32)
> 
>>
>> But I still don't like the handling in this patch. If we get a phy up, 
>> then the directly-attached disk ideally should be gone already, so 
>> should not have to do this handling.
> There is no problem when the disk is removed. The current problem is 
> that multiple phy up at the same time. When one of the phys up and 
> enters error handler to execute hardreset, the phy will down and then 
> up. other phy up will probably occupy the hw port id of the previous phy 
> which do hardreset in EH.

Could you do this work (itct update) in lldd_ata_check_ready CB?