Message-ID: <20151203150222.GH23396@atomide.com>
Date:	Thu, 3 Dec 2015 07:02:22 -0800
From:	Tony Lindgren <tony@...mide.com>
To:	Sekhar Nori <nsekhar@...com>
Cc:	John Ogness <john.ogness@...utronix.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Jason Cooper <jason@...edaemon.net>,
	Marc Zyngier <marc.zyngier@....com>,
	Felipe Balbi <balbi@...com>,
	Linux OMAP Mailing List <linux-omap@...r.kernel.org>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] irqchip: omap-intc: fix spurious irq handling

* Sekhar Nori <nsekhar@...com> [151203 03:29]:
> On Tuesday 20 October 2015 08:22 PM, Tony Lindgren wrote:
> > 
> > OK thanks for testing. My guess from the above list would be EDMA
> > or CPSW missing a flush of posted write. Maybe try adding a readback
> > of the related device revision register after acking the interrupt into
> > TPCC interrupt handler and CPSW interrupt handler(s)?
> 
> I could get back to debugging this only now. I have converted
> __raw_writel to writel() and also added readback from the same register
> in both EDMA and CPSW drivers. But I am still able to reproduce the
> spurious irq reports.
> 
> > The timer2 and uart0 seem to be false positives here naturally.
> 
> I also added readback in 8250 driver. I haven't touched the timer
> driver, but I guess if that driver had an issue, it should have come out
> much earlier.
> 
> I also saw that sometimes previous irq was the TI LCDC interrupt. Added
> readback there too. Did not help.

OK strange, so far all the ones we've seen have been fixable that way.
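
For reference, the readback idea discussed above could be sketched roughly
as below. This is only a user-space model under stated assumptions: the
struct layout and the names model_regs and ack_and_flush() are hypothetical,
and plain memory stands in for the device. In a real driver the two accesses
would be writel() and readl() on the device's own registers, and the point is
that the read cannot complete until the earlier posted write has reached the
device.

```c
#include <stdint.h>

/*
 * Hypothetical register block standing in for a device such as EDMA or
 * CPSW. The real register layouts differ; this is plain memory.
 */
struct model_regs {
	volatile uint32_t revision;   /* read-only revision register */
	volatile uint32_t irq_status; /* interrupt ack register */
};

/*
 * Ack the interrupt, then read back the revision register.
 * In a driver this would be writel() followed by readl(); the read
 * forces the posted ack write out to the device before the handler
 * returns. The value read is returned only so the access is not
 * optimized away.
 */
static uint32_t ack_and_flush(struct model_regs *regs, uint32_t mask)
{
	regs->irq_status = mask; /* ack (posted write on real hardware) */
	return regs->revision;   /* readback flushes the posted write */
}
```

Whether the readback targets the revision register or the register just
written is a driver choice; both have been tried in this thread.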

> > I would not yet rule out the "previous interrupt" theory until you have
> > tried that. We really want to know the root cause of the issue, just
> > printing out spurious interrupt does not fix the problem :)
> 
> While we cannot rule out a software issue completely, the description in
> TRM around spurious interrupts suggests it can happen even with no role
> of software.

Yes, maybe we have more than one reason for them.

> May I suggest we go ahead and add this patch to the kernel after
> addressing Thomas's comment? At least it will prevent kernel from
> locking up with flood of prints when a spurious irq happens and allows
> easier debug by others too.

Yes, we should naturally fix the kernel locking up.

Please also add something like "enable debug for more information"
to the warning. Then print out the current and previous interrupt
if DEBUG is enabled. And in the comments, mention that the spurious
interrupts have often been fixed by adding a flush of the posted write
to the previous interrupt handler in the device driver.
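
Something along these lines is what I mean, as a rough user-space sketch
and not the actual patch. The names irq_tracker and note_irq() are made up
for illustration, the debug switch is a runtime flag here rather than a
compile-time DEBUG, and the messages go into a buffer instead of the
kernel log. The caller is assumed to have already decided whether the irq
number read from the controller is spurious.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bookkeeping for spurious interrupt reporting. */
struct irq_tracker {
	int prev_irq;           /* last valid irq seen, -1 if none yet */
	unsigned long spurious; /* number of spurious irqs counted */
	int warned;             /* set once the short warning was emitted */
};

/*
 * Record one interrupt. Valid irqs are only remembered as "previous".
 * For a spurious irq, count it and format a message into buf:
 *  - debug off: a single warning that points the user at debug
 *  - debug on:  the current and previous irq on every occurrence
 * Returns the snprintf() result, or 0 if nothing was logged.
 */
static int note_irq(struct irq_tracker *t, int irq, int is_spurious,
		    int debug, char *buf, size_t len)
{
	if (len > 0)
		buf[0] = '\0';

	if (!is_spurious) {
		t->prev_irq = irq;
		return 0;
	}

	t->spurious++;

	if (debug)
		return snprintf(buf, len,
				"spurious irq %d, previous irq %d\n",
				irq, t->prev_irq);

	if (t->warned)
		return 0; /* warn once so the log is not flooded */

	t->warned = 1;
	return snprintf(buf, len,
			"spurious irq, enable debug for more information\n");
}
```

Warning only once in the non-debug case is what keeps the kernel from
locking up under a flood of prints, while the debug output still names the
previous interrupt handler as the first place to look for a missing flush.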

Also, do you have a reproducible test case with a mainline kernel that I
could add to my collection of shell scripts?

Regards,

Tony