Message-ID: <68953040-1622-254b-f6f8-b279eccacfb1@kernel.org>
Date:   Mon, 22 May 2023 16:02:16 +0900
From:   Damien Le Moal <dlemoal@...nel.org>
To:     yangxingui <yangxingui@...wei.com>, jejb@...ux.ibm.com,
        martin.petersen@...cle.com, john.g.garry@...cle.com
Cc:     linux-scsi@...r.kernel.org, linux-kernel@...r.kernel.org,
        linuxarm@...wei.com, prime.zeng@...ilicon.com,
        kangfenglong@...wei.com
Subject: Re: [PATCH v2] ata: libata-scsi: Fix get identity data failed

On 5/22/23 10:35, Damien Le Moal wrote:
> On 5/8/23 10:11, yangxingui wrote:
>>
>>
>> On 2023/5/7 22:51, Damien Le Moal wrote:
>>> On 2023/05/05 18:06, yangxingui wrote:
>>>>
>>>>
>>>> On 2023/5/5 16:17, Damien Le Moal wrote:
>>>>> On 2023/05/05 11:57, Xingui Yang wrote:
>>>>>> The function ata_get_identity() uses the helper ata_scsi_find_dev() to get
>>>>>> the ata_device structure of a scsi device. However, when the ata device is
>>>>>> managed by libsas, ata_scsi_find_dev() returns NULL, turning
>>>>>> ata_get_identity() into a nop and always returns -ENOMSG.
>>>>>
>>>>> What do you do to hit the issue ? A while back for me it was the queue depth
>>>>> setting causing problems. As Garry mentioned, this led to patch 141f3d6256e5
>>>>> ("ata: libata-sata: Fix device queue depth control").
>>>> Attempt to return the correct value at ata_scsi_find_dev() instead of
>>>> NULL, when the ata device is managed by libsas?
>>>
>>> That I understand. My question is *what* user operation/command triggers this ?
>>> Because on my test setup, under normal use, I do not see this issue (beside what
>>> was already corrected with the queue depth control). Is the issue showing up
>>> when using passthrough commands only ?
>> Yeah, we found that command "hdparm -i /dev/sdc" always return faild for 
>> SATA HDD disk. as follows:
>> [root@...alhost ~]# hdparm -i /dev/sdc
>>
>> /dev/sdc:
>>   HDIO_GET_IDENTITY failed: Invalid argument
> 
> I cannot recreate this issue exactly like this. Here is my setup with a pm80xx
> driver (Adaptec HBA):
> 
> [7:0:0:0]    disk    ATA      WDC  WUH721818AL W232  /dev/sdd   /dev/sg5
> [7:0:1:0]    disk    ATA      WDC  WUH721818AL WTW2  /dev/sdi   /dev/sg6
> [7:0:2:0]    disk    ATA      WDC  WUH722222AL Wf86  /dev/sdf   /dev/sg7
> [7:0:3:0]    zbc     ATA      WDC  WSH722020AL W803  /dev/sdg   /dev/sg8
> 
> Using the first drive, I get:
> 
> sudo hdparm -i /dev/sdd
> 
> /dev/sdd:
> 
>  Model=WDC  WUH721818ALN604, FwRev=PCGNW232, SerialNo=3KG10LBK
>  Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
>  RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=56
>  BuffType=DualPortCache, BuffSize=unknown, MaxMultSect=2, MultSect=off
>  CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=4394582016
>  IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
>  PIO modes:  pio0 pio1 pio2 pio3 pio4
>  DMA modes:  mdma0 mdma1 mdma2
>  UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
>  AdvancedPM=yes: disabled (255) WriteCache=enabled
>  Drive conforms to: unknown:  ATA/ATAPI-2,3,4,5,6,7
> 
>  * signifies the current active mode
> 
> So all good. However, for the following drives, I get:
> 
> sudo hdparm -i /dev/sdi
> 
> /dev/sdi:
>  HDIO_GET_IDENTITY failed: No message of desired type
> 
> (same for sdf and sdg).
> 
> Will dig into this.

OK, so the issue is that __ata_scsi_find_dev() calls ata_find_dev() with devno
== scsidev->id. This leads to devno being 0, 1, 2 and 3 for connected drives
sdd, sdi, sdf and sdg, as shown by lsscsi. However, each drive has its own
port+link, with the link for each one having ata_link_max_devices() == 1, so
ata_find_dev() works only for the first drive with scsidev->id == 0 and fails
for the others. A naive fix would be this:

diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index 7bb12deab70c..e4d6f17d7ccc 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -2718,7 +2718,7 @@ static struct ata_device *__ata_scsi_find_dev(struct ata_port *ap,
        if (!sata_pmp_attached(ap)) {
                if (unlikely(scsidev->channel || scsidev->lun))
                        return NULL;
-               devno = scsidev->id;
+               devno = 0;
        } else {
                if (unlikely(scsidev->id || scsidev->lun))
                        return NULL;

And running this on my setup, it works. This makes libsas added ports/devices
look like AHCI ones, where all devices have ID 0 for the !pmp case.

However, I am not sure this would be OK for all setups...
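
To make the failure mode concrete, here is a minimal standalone sketch of the
lookup for the !pmp case. It is only a model under stated assumptions: the
model_port, model_link and model_dev structures and the model_* functions are
hypothetical stand-ins, not the libata definitions, and the naive_fix flag
simply selects between the old behavior (devno taken from the SCSI id) and the
change above (devno forced to 0).

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the libata structures. */
struct model_dev {
	int present;
};

struct model_link {
	int max_devices;		/* models ata_link_max_devices() */
	struct model_dev device[2];	/* at most 2: IDE master/slave */
};

struct model_port {
	struct model_link link;
};

/* Models the bounds check on devno for a port without a pmp attached. */
static struct model_dev *model_find_dev(struct model_port *ap, int devno)
{
	if (devno >= 0 && devno < ap->link.max_devices)
		return &ap->link.device[devno];
	return NULL;
}

/*
 * Models the devno selection: without the fix, devno is the SCSI id;
 * with the naive fix, devno is always 0.
 */
static struct model_dev *model_scsi_find_dev(struct model_port *ap,
					     unsigned int scsi_id,
					     int naive_fix)
{
	int devno = naive_fix ? 0 : (int)scsi_id;

	return model_find_dev(ap, devno);
}
```

With max_devices == 1, a SCSI id of 1, 2 or 3 falls outside the bounds check
and the lookup returns NULL unless devno is forced to 0, which matches what I
see for sdi, sdf and sdg.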

John,

Any idea if there are any cases where libsas-managed drives would end up not
being correctly identified by this change ? As long as a device always has its
own port, I do not see any issue. But is there a case where we could have
multiple devices on the same port ? Per libata, the max is 2, and that is only
for the IDE master/slave case. Otherwise, it is always 1.

Note that looking at the pmp case, I am not confident at all that the
identification is correct for libsas. But I do not think that anyone would ever
connect a pmp box to a libsas HBA...

-- 
Damien Le Moal
Western Digital Research
