lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 6 Jan 2022 13:03:49 +0000
From:   <Ajish.Koshy@...rochip.com>
To:     <john.garry@...wei.com>, <damien.lemoal@...nsource.wdc.com>,
        <jejb@...ux.ibm.com>, <martin.petersen@...cle.com>,
        <jinpu.wang@...ud.ionos.com>, <Viswas.G@...rochip.com>
CC:     <linux-scsi@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <vishakhavc@...gle.com>, <ipylypiv@...gle.com>,
        <Ruksar.devadi@...rochip.com>,
        <Vasanthalakshmi.Tharmarajan@...rochip.com>
Subject: RE: [PATCH RFT] scsi: pm8001: Fix FW crash for maxcpus=1

> On 05/01/2022 04:03, Damien Le Moal wrote:
> > On 1/5/22 03:26, John Garry wrote:
> >> According to the comment in check_fw_ready() we should not check the
> >> IOP1_READY field in register SCRATCH_PAD_1 for 8008 or 8009
> controllers.
> >>
> >> However we check this very field in process_oq() for processing the
> >> highest index interrupt vector. Change that function to not check
> >> IOP1_READY for those mentioned controllers, but do check ILA_READY in
> both cases.
> >>
> >> The reason I assume that this was not hit earlier was because we
> >> always allocated 64 MSI(X), and just did not pass the vector index
> >> check in process_oq(), i.e.  the handler never ran for vector index 63.
> >>
> >> Signed-off-by: John Garry<john.garry@...wei.com>
> >>
> >> diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c
> >> b/drivers/scsi/pm8001/pm80xx_hwi.c
> >> index 2101fc5761c3..77b8bb30615b 100644
> >> --- a/drivers/scsi/pm8001/pm80xx_hwi.c
> >> +++ b/drivers/scsi/pm8001/pm80xx_hwi.c
> >> @@ -4162,9 +4162,16 @@ static int process_oq(struct pm8001_hba_info
> *pm8001_ha, u8 vec)
> >>      u32 regval;
> >>
> >>      if (vec == (pm8001_ha->max_q_num - 1)) {
> >> +            u32 mipsall_ready;
> >> +
> >> +            if ((pm8001_ha->chip_id == chip_8008) ||
> >> +                (pm8001_ha->chip_id == chip_8009))
> > nit: no need for the inner brackets here.
> 
> ok, I can fix that.
> 
> But I would also like opinion from microchip guys/maintainer on why this
> code is here at all. Seems strange in the way we check in this register in the
> interrupt handler for only a specific vector and, also, why we check at all in
> an interrupt handler.

Here is my initial understanding so far based on the code
and data sheet

1. Controller has the capability to communicate
to the host about fatal error condition via configured
interrupt vector MSI/MSI-X.
2. This capability is achieved by setting two fields
 a. Enable Controller Fatal error notification 
    Dowrd 0x1C, Bit[0].
    1 - Enable; 0 - Disable
    Code: pm8001_ha->main_cfg_tbl.pm80xx_tbl.
    fatal_err_interrupt = 0x01;
 b. Fatal Error Interrupt Vector Dword 0x1C, bit[15:8]
    This parameter configures which interrupt vector
    is used to notify the host of the fatal error.
    Code: /* Update Fatal error interrupt vector */
    pm8001_ha->main_cfg_tbl.pm80xx_tbl.
    fatal_err_interrupt |=
    ((pm8001_ha->max_q_num - 1) << 8);

Probably this will be the reason why we check
the vector in process_oq() for processing
controller fatal error 

if (vec == (pm8001_ha->max_q_num - 1)) {
 
Please do let me know if it helped in clarification.

> >
> >> +                    mipsall_ready = SCRATCH_PAD_MIPSALL_READY_8PORT;
> >> +            else
> >> +                    mipsall_ready =
> >> + SCRATCH_PAD_MIPSALL_READY_16PORT;
> >> +
> >>              regval = pm8001_cr32(pm8001_ha, 0, MSGU_SCRATCH_PAD_1);
> >> -            if ((regval & SCRATCH_PAD_MIPSALL_READY) !=
> >> -                                    SCRATCH_PAD_MIPSALL_READY) {
> >> +            if ((regval & mipsall_ready) != mipsall_ready) {
> >>                      pm8001_ha->controller_fatal_error = true;
> >>                      pm8001_dbg(pm8001_ha, FAIL,
> >>                                 "Firmware Fatal error!
> >> Regval:0x%x\n", diff --git a/drivers/scsi/pm8001/pm80xx_hwi.h
> >> b/drivers/scsi/pm8001/pm80xx_hwi.h
> >> index c7e5d93bea92..c41ed039c92a 100644
> >> --- a/drivers/scsi/pm8001/pm80xx_hwi.h
> >> +++ b/drivers/scsi/pm8001/pm80xx_hwi.h
> >> @@ -1405,8 +1405,12 @@ typedef struct SASProtocolTimerConfig
> SASProtocolTimerConfig_t;
> >>   #define SCRATCH_PAD_BOOT_LOAD_SUCCESS      0x0
> >>   #define SCRATCH_PAD_IOP0_READY             0xC00
> >>   #define SCRATCH_PAD_IOP1_READY             0x3000
> >> -#define SCRATCH_PAD_MIPSALL_READY   (SCRATCH_PAD_IOP1_READY |
> \
> >> +#define SCRATCH_PAD_MIPSALL_READY_16PORT
> (SCRATCH_PAD_IOP1_READY | \
> >>                                      SCRATCH_PAD_IOP0_READY | \
> >> +                                    SCRATCH_PAD_ILA_READY | \
> >> +                                    SCRATCH_PAD_RAAE_READY)
> >> +#define SCRATCH_PAD_MIPSALL_READY_8PORT
> (SCRATCH_PAD_IOP0_READY | \
> >> +                                    SCRATCH_PAD_ILA_READY | \
> >>                                      SCRATCH_PAD_RAAE_READY)
> >>
> >>   /* boot loader state */
> > Otherwise, looks OK to me.
> > I tested with and without max_cpus=1 with a ATTO Technology, Inc.
> > ExpressSAS 12Gb/s SAS/SATA HBA (rev 06) adapter and everything is OK.
> > That adapter uses chip_8072 though, not 8008 or 8009.
> >
> > Feel free to add:
> >
> > Reviewed-by: Damien Le Moal<damien.lemoal@...nsource.wdc.com>
> > Tested-by: Damien Le Moal<damien.lemoal@...nsource.wdc.com>
> 
> Thanks!
> 
> john

Thanks,

Ajish

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ