Message-ID: <A57AEA84-5CA0-403E-8053-106033C73C70@fb.com>
Date:   Wed, 30 Aug 2023 19:37:27 +0000
From:   Michal Grzedzicki <mge@...a.com>
To:     "linux-scsi@...r.kernel.org" <linux-scsi@...r.kernel.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: pm80xx: Issues with SATA drives behind expander

Hi,
I'm trying to run Linux 6.5-rc6 on an x86_64 system with an old Adaptec HBA
that uses the pm80xx driver (Adaptec Device 8074, Subsystem: PMC-Sierra Inc. Device 0800).

The machine has 2 expanders with 10 SATA disks each, and 2 SAS drives connected directly.

pm80xx -----  port0 (3 phy's) ---> exp0 ---> SATA, ..SATA, SES enc0 * works
      \----------  port1               ---> SAS * works
       \---------  port2 (3 phy's) ---> exp1 ---> SATA, ..SATA, SES enc1 * does not work
        \--------  port3                ---> SAS * works


If CONFIG_SCSI_SAS_ATA is not enabled, the machine discovers only the 2 SAS drives
and works correctly.

When it is enabled, the kernel runs out of reserved task tags and never finishes discovery.

Both expanders have the same SAS address, but they are connected to different ports.

If I pass "libata.dma=0 libata.force=noncq" and apply the changes below, the kernel is able to detect the drives on the first expander.
The drives on the second expander are detected by the link layer, but they all fail to complete ATA IDENTIFY commands.

[pm80xx] : Do not leak reserved tag in mpi_set_controller_config_resp()
Save 1 tag from leaking.

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 97f54fbb3812..3a6157b9a77b 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -3673,10 +3673,12 @@ static int mpi_set_controller_config_resp(struct pm8001_hba_info *pm8001_ha,
                       (struct set_ctrl_cfg_resp *)(piomb + 4);
       u32 status = le32_to_cpu(pPayload->status);
       u32 err_qlfr_pgcd = le32_to_cpu(pPayload->err_qlfr_pgcd);
+       u32 tag = le32_to_cpu(pPayload->tag);

       pm8001_dbg(pm8001_ha, MSG,
                  "SET CONTROLLER RESP: status 0x%x qlfr_pgcd 0x%x\n",
                  status, err_qlfr_pgcd);
+       pm8001_tag_free(pm8001_ha, tag);

       return 0;
}
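
To illustrate why the missing pm8001_tag_free() matters, here is a minimal userspace sketch of a reserved-tag pool. It is not the driver's code: struct tag_pool, tag_alloc(), tag_free() and run_requests() are hypothetical names, and RESERVE_SLOTS only mirrors the unpatched PM8001_RESERVE_SLOT value of 8. The loop deliberately repeats the leak to show the end state; in the driver the leak may be a single tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RESERVE_SLOTS 8	/* assumption: mirrors the unpatched PM8001_RESERVE_SLOT */

/* Hypothetical userspace model of a reserved-tag pool, not the driver's code. */
struct tag_pool {
	uint64_t in_use;	/* bit n set => tag n is allocated */
};

/* Allocate the lowest free tag; return -1 when the pool is exhausted. */
static int tag_alloc(struct tag_pool *p)
{
	for (int tag = 0; tag < RESERVE_SLOTS; tag++) {
		if (!(p->in_use & (UINT64_C(1) << tag))) {
			p->in_use |= UINT64_C(1) << tag;
			return tag;
		}
	}
	return -1;
}

/* Release a previously allocated tag. */
static void tag_free(struct tag_pool *p, int tag)
{
	assert(tag >= 0 && tag < RESERVE_SLOTS);
	p->in_use &= ~(UINT64_C(1) << tag);
}

/*
 * Issue up to n request/response round trips. When free_on_resp is false
 * the response handler keeps the tag, which models the leak. Returns the
 * number of requests that obtained a tag.
 */
static int run_requests(struct tag_pool *p, int n, bool free_on_resp)
{
	int issued = 0;

	for (int i = 0; i < n; i++) {
		int tag = tag_alloc(p);

		if (tag < 0)
			break;		/* pool exhausted */
		issued++;
		if (free_on_resp)
			tag_free(p, tag);
	}
	return issued;
}
```

In this model, run_requests(&pool, 20, false) stops after RESERVE_SLOTS requests because no tag is ever returned, while the same call with free_on_resp set to true completes all 20.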


[pm80xx] : Decrease running_req for null tasks in mpi_sata_completion
Without it, the discovery process never finishes.

diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c
index 39a12ee94a72..97f54fbb3812 100644
--- a/drivers/scsi/pm8001/pm80xx_hwi.c
+++ b/drivers/scsi/pm8001/pm80xx_hwi.c
@@ -2292,6 +2292,8 @@ mpi_sata_completion(struct pm8001_hba_info *pm8001_ha,
               pm8001_dbg(pm8001_ha, FAIL, "task null, freeing CCB tag %d\n",
                          ccb->ccb_tag);
               pm8001_ccb_free(pm8001_ha, ccb);
+               if (pm8001_dev)
+                       atomic_dec(&pm8001_dev->running_req);
               return;
       }
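
The counter imbalance can be sketched in userspace as follows. This is only a model under stated assumptions: struct model_dev, submit_req(), complete_req() and device_idle() are hypothetical names, and the idea that something waits for running_req to drain to zero is my reading of the symptom, not something verified against the driver.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device state; not the driver's struct. */
struct model_dev {
	atomic_int running_req;	/* requests submitted but not yet completed */
};

/* Submission path: account for one more outstanding request. */
static void submit_req(struct model_dev *dev)
{
	atomic_fetch_add(&dev->running_req, 1);
}

/*
 * Completion path. task_is_null models the early-return branch taken when
 * the task is already gone. With patched == false that branch returns
 * without decrementing, so the counter never gets back to zero.
 */
static void complete_req(struct model_dev *dev, bool task_is_null, bool patched)
{
	if (task_is_null) {
		if (patched && dev != NULL)
			atomic_fetch_sub(&dev->running_req, 1);
		return;
	}
	atomic_fetch_sub(&dev->running_req, 1);
}

/* Assumption: a waiter treats the device as quiesced only at zero. */
static bool device_idle(struct model_dev *dev)
{
	return atomic_load(&dev->running_req) == 0;
}
```

With patched set to false, one submit_req() followed by a null-task complete_req() leaves device_idle() false forever; with patched set to true the counter returns to zero.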


[pm80xx] : Increase PM8001_RESERVE_SLOT so it can abort jobs on more than 8 devices
Without it, the driver runs out of tags and loops while trying to abort all 10 failed ATA IDENTIFY commands.

diff --git a/drivers/scsi/pm8001/pm8001_defs.h b/drivers/scsi/pm8001/pm8001_defs.h
index 501b574239e8..f7d348165f7e 100644
--- a/drivers/scsi/pm8001/pm8001_defs.h
+++ b/drivers/scsi/pm8001/pm8001_defs.h
@@ -90,7 +90,7 @@ enum port_type {
#define        PM8001_MAX_PORTS         16     /* max. possible ports */
#define        PM8001_MAX_DEVICES       2048   /* max supported device */
#define        PM8001_MAX_MSIX_VEC      64     /* max msi-x int for spcv/ve */
-#define        PM8001_RESERVE_SLOT      8
+#define        PM8001_RESERVE_SLOT      64

#define        CONFIG_SCSI_PM8001_MAX_DMA_SG   528
#define PM8001_MAX_DMA_SG      CONFIG_SCSI_PM8001_MAX_DMA_SG


Both expanders are visible, and SMP discovery finds the devices correctly. The same hardware works correctly on FreeBSD,
and the devices found by SMP discovery are consistent with those reported by FreeBSD's camcontrol smpphylist.

# smp_discover_list  /dev/bsg/expander-0\:0
 phy   0:U:attached:[500e004abbbbbb00:00  t(SATA)]  6 Gbps  ZG:10
 phy   1: inaccessible (phy vacant)
 phy   2:U:attached:[500e004abbbbbb02:00  t(SATA)]  6 Gbps  ZG:10
 phy   3: inaccessible (phy vacant)
..
 phy  22:U:attached:[ffffffffffffffff:00  i(SSP+STP+SMP)]  12 Gbps  ZG:8
 phy  23:U:attached:[ffffffffffffffff:01  i(SSP+STP+SMP)]  12 Gbps  ZG:8
..
 phy  36:D:attached:[500e004abbbbbb7e:36  V i(SSP) t(SSP)]  12 Gbps  ZG:2

# smp_discover_list  /dev/bsg/expander-0\:1
 phy   0: inaccessible (phy vacant)
 phy   1:U:attached:[500e004abbbbbb01:00  t(SATA)]  6 Gbps  ZG:11
 phy   2: inaccessible (phy vacant)
 phy   3:U:attached:[500e004abbbbbb03:00  t(SATA)]  6 Gbps  ZG:11
 phy   4:U:attached:[500e004abbbbbb04:00  t(SATA)]  6 Gbps  ZG:11
...
 phy  36:D:attached:[500e004abbbbbb7e:36  V i(SSP) t(SSP)]  12 Gbps  ZG:2


Working SATA drive on the first expander:
# smp_rep_phy_sata -p 0 /dev/bsg/expander-0\:0
Report phy SATA response:
 expander change count: 36861
 phy identifier: 0
 STP I_T nexus loss occurred: 0
 affiliations supported: 1
 affiliation valid: 1
 STP SAS address: 0x500e004abbbbbb00
 register device to host FIS:
   34 00 50 01 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00
 affiliated STP initiator SAS address: 0xffffffffffffffff
 STP I_T nexus loss SAS address: 0x0
 affiliation context: 0
 current affiliation contexts: 1
 maximum affiliation contexts: 1

Non-working SATA drive on the second expander:
# smp_rep_phy_sata -p 3 /dev/bsg/expander-0\:1
Report phy SATA response:
 expander change count: 36861
          ^^^^^^^^^
The reported change count is the same for both expanders, which looks suspicious.

 phy identifier: 3
 STP I_T nexus loss occurred: 1
 affiliations supported: 1
 affiliation valid: 0
         ^^^^^^^
affiliation valid is zero

 STP SAS address: 0x500e004abbbbbb03
 register device to host FIS:
   34 00 50 01 01 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00
 affiliated STP initiator SAS address: 0xffffffffffffffff
        ^^^^^
Does this mean the affiliation succeeded but was undone by nexus loss or another event?

 STP I_T nexus loss SAS address: 0xffffffffffffffff
 affiliation context: 0
 current affiliation contexts: 0
 maximum affiliation contexts: 1

Logs:
https://gist.github.com/mge-fbe-com/084abe34038f5f10630b5c4519f301d2

Verbose logs:
https://gist.github.com/mge-fbe-com/a7c830599e6cc7f8017b4722bb58a901


Thanks,
Michal
