lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20171003114242.603943453@linuxfoundation.org>
Date:   Tue,  3 Oct 2017 14:28:52 +0200
From:   Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To:     linux-kernel@...r.kernel.org
Cc:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        stable@...r.kernel.org,
        "Guilherme G. Piccoli" <gpiccoli@...ux.vnet.ibm.com>,
        Dave Carroll <david.carroll@...rosemi.com>,
        "Martin K. Petersen" <martin.petersen@...cle.com>
Subject: [PATCH 4.13 030/110] scsi: aacraid: Add a small delay after IOP reset

4.13-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Guilherme G. Piccoli <gpiccoli@...ux.vnet.ibm.com>

commit d1b490939d8c117a06dfc562c41d933f71d30289 upstream.

Commit 0e9973ed3382 ("scsi: aacraid: Add periodic checks to see IOP reset
status") changed the way driver checks if a reset succeeded. Now, after an
IOP reset, aacraid immediately start polling a register to verify the reset
is complete.

This behavior cause regressions on the reset path in PowerPC (at least).
Since the delay after the IOP reset was removed by the aforementioned patch,
the fact driver just starts to read a register instantly after the reset
was issued (by writing in another register) "corrupts" the reset procedure,
which ends up failing all the time.

The issue highly impacted kdump on PowerPC, since on kdump path we
proactively issue a reset in adapter (through the reset_devices kernel
parameter).

This patch (re-)adds a delay right after IOP reset is issued. Empirically
we measured that 3 seconds is enough, but for safety reasons we delay
for 5s (and since it was 30s before, 5s is still a small amount).

For reference, without this patch we observe the following messages
on kdump kernel boot process:

  [ 76.294] aacraid 0003:01:00.0: IOP reset failed
  [ 76.294] aacraid 0003:01:00.0: ARC Reset attempt failed
  [ 86.524] aacraid 0003:01:00.0: adapter kernel panic'd ff.
  [ 86.524] aacraid 0003:01:00.0: Controller reset type is 3
  [ 86.524] aacraid 0003:01:00.0: Issuing IOP reset
  [146.534] aacraid 0003:01:00.0: IOP reset failed
  [146.534] aacraid 0003:01:00.0: ARC Reset attempt failed

Fixes: 0e9973ed3382 ("scsi: aacraid: Add periodic checks to see IOP reset status")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@...ux.vnet.ibm.com>
Acked-by: Dave Carroll <david.carroll@...rosemi.com>
Signed-off-by: Martin K. Petersen <martin.petersen@...cle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>

---
 drivers/scsi/aacraid/src.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/scsi/aacraid/src.c
+++ b/drivers/scsi/aacraid/src.c
@@ -740,6 +740,8 @@ static void aac_send_iop_reset(struct aa
 	aac_set_intx_mode(dev);
 
 	src_writel(dev, MUnit.IDR, IOP_SRC_RESET_MASK);
+
+	msleep(5000);
 }
 
 static void aac_send_hardware_soft_reset(struct aac_dev *dev)


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ