lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Wed, 22 Jun 2016 13:51:31 +0200
From:	Johannes Thumshirn <jthumshirn@...e.de>
To:	"Martin K. Petersen" <martin.petersen@...cle.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Josh Boyer <jwboyer@...oraproject.org>,
	Thorsten Leemhuis <regressions@...mhuis.info>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Quinn Tran <quinn.tran@...gic.com>
Subject: Re: Reported regressions for 4.7 as of Sunday, 2016-06-19

On Tue, Jun 21, 2016 at 09:25:18PM -0400, Martin K. Petersen wrote:
> >>>>> "Linus" == Linus Torvalds <torvalds@...ux-foundation.org> writes:
> 
> >> https://bugzilla.redhat.com/show_bug.cgi?id=1348342
> 
> This first one appears to be a crash in a USB sound doodad and not
> qla2xxx. Also, this appears to be where the 4.5.5 -> 4.5.6 notion comes
> from. So we can probably ignore 4.5.5 as the last good revision.
> 
> Linus> as far as I can tell. And neither of them looks very likely, but
> Linus> what do I know. Adding Martin Petersen and Johannes Thumshirn to
> Linus> the participants just in case they go "Ahh.."
> 
> Doubt it's Johannes' tweak. The qla2xxx crash from the two other
> bugzilla entries is in:
> 
> (gdb) list *qla24xx_process_response_queue+0x49
> 0x27e09 is in qla24xx_process_response_queue (drivers/scsi/qla2xxx/qla_isr.c:2560).
> 2555            if (rsp->msix->cpuid != smp_processor_id()) {
> 2556                    /* if kernel does not notify qla of IRQ's CPU change,
> 2557                     * then set it here.
> 2558                     */
> 2559                    rsp->msix->cpuid = smp_processor_id();
> 2560                    ha->tgt.rspq_vector_cpuid = rsp->msix->cpuid;
> 2561            }
> 2562
> 2563            while (rsp->ring_ptr->signature != RESPONSE_PROCESSED) {
> 2564                    pkt = (struct sts_entry_24xx *)rsp->ring_ptr;
> 
> That particular code went into 4.5 and comes from:
> 
> commit cdb898c52d1dfad4b4800b83a58b3fe5d352edde
> Author: Quinn Tran <quinn.tran@...gic.com>
> Date:   Thu Dec 17 14:57:05 2015 -0500
> 
>     qla2xxx: Add irq affinity notification
> 
>     Register to receive notification of when irq setting change
>     occured.
> 
>     Signed-off-by: Quinn Tran <quinn.tran@...gic.com>
>     Signed-off-by: Himanshu Madhani <himanshu.madhani@...gic.com>
>     Signed-off-by: Nicholas Bellinger <nab@...ux-iscsi.org>
> 
> Quinn?
> 
> -- 
> Martin K. Petersen	Oracle Linux Engineering

Having a quick look at it I _think_ this could be the problem.
We request the IRQ _before_ actually assigning the rsp->msix entry. Now If an
IRQ triggers, before the assignment we touch rsp->msix->cpuid, which is
probably the case. At least from what I conduct from Martin's mail.

diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 5649c20..20743a3 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -3086,6 +3086,8 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
 	/* Enable MSI-X vectors for the base queue */
 	for (i = 0; i < 2; i++) {
 		qentry = &ha->msix_entries[i];
+		qentry->rsp = rsp;
+		rsp->msix = qentry;
 		if (IS_P3P_TYPE(ha))
 			ret = request_irq(qentry->vector,
 				qla82xx_msix_entries[i].handler,
@@ -3097,8 +3099,6 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
 		if (ret)
 			goto msix_register_fail;
 		qentry->have_irq = 1;
-		qentry->rsp = rsp;
-		rsp->msix = qentry;
 
 		/* Register for CPU affinity notification. */
 		irq_set_affinity_notifier(qentry->vector, &qentry->irq_notify);
@@ -3119,12 +3119,12 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
 	 */
 	if (QLA_TGT_MODE_ENABLED() && IS_ATIO_MSIX_CAPABLE(ha)) {
 		qentry = &ha->msix_entries[ATIO_VECTOR];
+		qentry->rsp = rsp;
+		rsp->msix = qentry;
 		ret = request_irq(qentry->vector,
 			qla83xx_msix_entries[ATIO_VECTOR].handler,
 			0, qla83xx_msix_entries[ATIO_VECTOR].name, rsp);
 		qentry->have_irq = 1;
-		qentry->rsp = rsp;
-		rsp->msix = qentry;
 	}
 
 msix_register_fail:


I'm not sure if we need the qentry->have_irq assingment as well, I'm not 
deep enough into the qla2xx driver yet, maybe Quinn can clarify.
Beware of the above change being untested.

Byte,
	Johannes
-- 
Johannes Thumshirn                                          Storage
jthumshirn@...e.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ