Message-ID: <876122c7vd.fsf@linutronix.de>
Date:	Tue, 20 Oct 2015 09:32:54 +0200
From:	John Ogness <john.ogness@...utronix.de>
To:	Sekhar Nori <nsekhar@...com>
Cc:	Tony Lindgren <tony@...mide.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Jason Cooper <jason@...edaemon.net>,
	Marc Zyngier <marc.zyngier@....com>,
	Felipe Balbi <balbi@...com>,
	Linux OMAP Mailing List <linux-omap@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] irqchip: omap-intc: fix spurious irq handling

On 2015-10-20, Sekhar Nori <nsekhar@...com> wrote:
>> Do you know what really is causing the spurious interrupts in your
>> case?
>
> No, not yet.

According to the TRM this is normal behavior if conditions that might
affect priority are changed during priority sorting.

    6.2.5 ARM A8 INTC Spurious Interrupt Handling

    The spurious flag indicates whether the result of the sorting (a
    window of 10 INTC functional clock cycles after the interrupt
    assertion) is invalid. The sorting is invalid if:

    - The interrupt that triggered the sorting is no longer active
      during the sorting.

    - A change in the mask has affected the result during the sorting
      time.

>> In all the cases I've seen, the spurious interrupts were caused by a
>> missing flush of the posted write acking the IRQ at the device driver
>> for the _previously triggered_ INTC interrupt.
>> 
>> If you have a reproducible case, I suggest you test that by printing
>> out the previous interrupt to check if that makes sense. And then see
>> if adding the missing read back to that interrupt handler fixes the
>> issue.
>
> Okay, that's good to know. Thanks for the hints and history of your debug
> on OMAP3. The issue is not easily reproducible in my case. But if I try
> hard enough, I can hit it, though. So I can surely try your hints.

I can reproduce the situation very easily. After running a test for a
few minutes and printing out the previous interrupt, I have the
following list. These are the irq numbers seen by the handler before the
spurious interrupt triggered.

    INT12 - EDMACOMPINT - TPCC (EDMA)
    INT41 - 3PGSWRXINT0 - CPSW (Ethernet)
    INT42 - 3PGSWTXINT0 - CPSW (Ethernet)
    INT68 - TINT2       - DMTIMER2
    INT72 - UART0INT    - UART0

From this I do not think we can put the blame on any single driver. I
trigger this situation very easily by putting a load of 7,000+
interrupts per second on the system. With a sorting window of 10 INTC
clock cycles per interrupt, this means we have 70,000+ INTC clock
cycles per second where a change in the interrupt priority conditions
would cause the priority sorting to become invalid and thus cause the
spurious interrupt.
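
As a rough sketch of that arithmetic: the 10-cycle sorting window comes
from the TRM excerpt above, while the macro and function names below are
made up purely for illustration and do not exist in any driver.

```c
#include <stdint.h>

/* Sorting window per interrupt assertion, from TRM section 6.2.5. */
#define INTC_SORT_WINDOW_CYCLES 10u

/*
 * Hypothetical helper: number of INTC functional clock cycles per
 * second during which a priority sort is in progress, i.e. during
 * which a mask or priority change can invalidate the result.
 * Assumes the sorting windows of separate interrupts do not overlap.
 */
static uint64_t intc_sorting_cycles_per_sec(uint64_t irqs_per_sec)
{
	return irqs_per_sec * INTC_SORT_WINDOW_CYCLES;
}
```

Calling intc_sorting_cycles_per_sec(7000) gives the 70,000 cycles
mentioned above. How large a fraction of each second that is depends on
the INTC functional clock rate, which I have not looked up.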

I'm not sure if we can/should do anything more than Sekhar's patch of
acknowledging the spurious interrupt so the priority sorting algorithm
can run again.
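
To make the idea concrete, here is a minimal user-space sketch of
"detect the spurious flag, acknowledge, and bail out". It is not
Sekhar's patch. The register offsets and masks are what I believe the
TRM and irq-omap-intc.c use, so please verify them. The register file is
simulated with a plain array, and intc_get_irq() is a hypothetical name.

```c
#include <stdint.h>

/* Assumed register layout; verify against the TRM before relying on it. */
#define INTC_SIR		0x0040	/* active IRQ number + spurious flag */
#define INTC_CONTROL		0x0048
#define NEWIRQAGR		0x1u	/* restart priority sorting */
#define ACTIVEIRQ_MASK		0x7fu
#define SPURIOUSIRQ_MASK	(0x1ffffffu << 7)

/* Simulated register file standing in for the memory-mapped INTC. */
static uint32_t fake_intc_regs[0x100 / 4];

static uint32_t intc_readl(uint32_t reg)
{
	return fake_intc_regs[reg / 4];
}

static void intc_writel(uint32_t reg, uint32_t val)
{
	fake_intc_regs[reg / 4] = val;
}

/*
 * Hypothetical helper: return the active irq number, or -1 if the
 * sorting result was flagged as spurious. In the spurious case the
 * interrupt is acknowledged so that priority sorting can run again.
 */
static int intc_get_irq(void)
{
	uint32_t sir = intc_readl(INTC_SIR);

	if (sir & SPURIOUSIRQ_MASK) {
		intc_writel(INTC_CONTROL, NEWIRQAGR);
		return -1;
	}

	return (int)(sir & ACTIVEIRQ_MASK);
}
```

In a real handler the caller would just return on -1 and wait for the
INTC to raise the interrupt again with a valid sorting result.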

John Ogness