Date: Fri, 07 Feb 2020 16:11:59 +0000
From: Marc Zyngier <maz@...nel.org>
To: Lukas Wunner <lukas@...ner.de>
Cc: Thomas Gleixner <tglx@...utronix.de>,
    Jason Cooper <jason@...edaemon.net>,
    Nicolas Saenz Julienne <nsaenzjulienne@...e.de>,
    Florian Fainelli <f.fainelli@...il.com>,
    Ray Jui <rjui@...adcom.com>,
    Scott Branden <sbranden@...adcom.com>,
    bcm-kernel-feedback-list@...adcom.com,
    linux-kernel@...r.kernel.org,
    linux-rpi-kernel@...ts.infradead.org,
    linux-arm-kernel@...ts.infradead.org,
    Serge Schneider <serge@...pberrypi.org>,
    Kristina Brooks <notstina@...il.com>,
    Stefan Wahren <wahrenst@....net>,
    Matthias Brugger <mbrugger@...e.com>,
    Martin Sperl <kernel@...tin.sperl.org>,
    Phil Elwell <phil@...pberrypi.org>
Subject: Re: [PATCH] irqchip/bcm2835: Quiesce IRQs left enabled by bootloader

Hi Lukas,

On 2020-02-07 15:46, Lukas Wunner wrote:
> Customers of our "Revolution Pi" open source PLCs (which are based on
> the Raspberry Pi) have reported random lockups as well as jittery eMMC,
> UART and SPI latency. We were able to reproduce the lockups in our lab
> and hooked up a JTAG debugger:
>
> It turns out that the USB controller's interrupt is already enabled when
> the kernel boots. All interrupts are disabled when the chip comes out
> of power-on reset, according to the spec. So apparently the bootloader
> enables the interrupt but neglects to disable it before handing over
> control to the kernel.
>
> The bootloader is a closed source blob provided by the Raspberry Pi
> Foundation. Development of an alternative open source bootloader was
> begun by Kristina Brooks but it's not fully functional yet. Usage of
> the blob is thus without alternative for the time being.
>
> The Raspberry Pi Foundation's downstream kernel has a performance-
> optimized USB driver (which we use on our Revolution Pi products).
> The driver takes advantage of the FIQ fast interrupt. Because the
> regular USB interrupt was left enabled by the bootloader, both the
> FIQ and the normal interrupt is enabled once the USB driver probes.
>
> The spec has the following to say on simultaneously enabling the FIQ
> and the normal interrupt of a peripheral:
>
> "One interrupt source can be selected to be connected to the ARM FIQ
> input. An interrupt which is selected as FIQ should have its normal
> interrupt enable bit cleared. Otherwise a normal and an FIQ interrupt
> will be fired at the same time. Not a good idea!"

Or to spell it out more clearly: Braindead hardware. Really.

>                                 ^^^^^^^^^^^^^^^
> https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
> page 110
>
> On a multicore Raspberry Pi, the Foundation's kernel routes all normal
> interrupts to CPU 0 and the FIQ to CPU 1. Because both the FIQ and the
> normal interrupt is enabled, a USB interrupt causes CPU 0 to spin in
> bcm2836_chained_handle_irq() until the FIQ on CPU 1 has cleared it.
> Interrupts with a lower priority than USB are starved as long.
>
> That explains the jittery eMMC, UART and SPI latency: On one occasion
> I've seen CPU 0 blocked for no less than 2.9 msec. Basically,
> everything not USB takes a performance hit: Whereas eMMC throughput
> on a Compute Module 3 remains relatively constant at 23.5 MB/s with
> this commit, it irregularly dips to 23.0 MB/s without this commit.
>
> The lockups occur when CPU 0 receives a USB interrupt while holding a
> lock which CPU 1 is trying to acquire while the FIQ is temporarily
> disabled on CPU 1.
>
> I've tested old releases of the Foundation's bootloader as far back as
> 1.20160202-1 and they all leave the USB interrupt enabled. Still older
> releases fail to boot a contemporary kernel on a Compute Module 1 or 3,
> which are the only Raspberry Pi variants I have at my disposal for
> testing.
>
> Fix by disabling IRQs left enabled by the bootloader. Although the
> impact is most pronounced on the Foundation's downstream kernel,
> it seems prudent to apply the fix to the upstream kernel to guard
> against such mistakes in any present and future bootloader.
>
> An alternative, though more convoluted approach would be to clear the
> IRQD_IRQ_MASKED flag on all interrupts left enabled on boot. Then the
> first invocation of handle_level_irq() would mask and thereby quiesce
> those interrupts.

Nah, that's terrible. The right thing to do is indeed to mop up the mess
that the bootloader is bound to leave and start with a clean slate.

>
> Signed-off-by: Lukas Wunner <lukas@...ner.de>
> Cc: Serge Schneider <serge@...pberrypi.org>
> Cc: Kristina Brooks <notstina@...il.com>
> Cc: stable@...r.kernel.org
> ---
>  drivers/irqchip/irq-bcm2835.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/irqchip/irq-bcm2835.c b/drivers/irqchip/irq-bcm2835.c
> index 418245d31921..0d9a5a7ebe2c 100644
> --- a/drivers/irqchip/irq-bcm2835.c
> +++ b/drivers/irqchip/irq-bcm2835.c
> @@ -150,6 +150,13 @@ static int __init armctrl_of_init(struct device_node *node,
>  		intc.enable[b] = base + reg_enable[b];
>  		intc.disable[b] = base + reg_disable[b];
>
> +		irq = readl(intc.enable[b]);

readl_relaxed(), please. irq is not quite the right type either,
please use a u32.

> +		if (irq) {
> +			writel(irq, intc.disable[b]);

writel_relaxed().

> +			pr_err(FW_BUG "Bootloader left irq enabled: "
> +			       "bank %d irq %*pbl\n", b, IRQS_PER_BANK, &irq);
> +		}
> +
>  		for (i = 0; i < bank_irqs[b]; i++) {
>  			irq = irq_create_mapping(intc.domain, MAKE_HWIRQ(b, i));
>  			BUG_ON(irq <= 0);

Don't you need to do something about the FIQ side as well?

         M.
-- 
Jazz is not dead. It just smells funny...
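
[Editorial sketch, not part of the original message.] The quiesce step
discussed above can be modelled in user space roughly as follows. This is
a minimal sketch under stated assumptions, not the patch itself and not
kernel code: struct fake_intc, fake_disable() and quiesce_banks() are
hypothetical names, the three-bank layout is assumed for the model, and
plain memory stands in for the readl_relaxed()/writel_relaxed() accesses
to the enable and disable registers. It uses a 32-bit unsigned value for
the register contents, as requested in the review, and models the disable
register as "write 1 to clear the enable bit", which is how the patch
uses it. The FIQ side raised at the end of the review is not modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NR_BANKS 3	/* assumption: number of banks in this model */

/* Hypothetical stand-in for the controller: one enable word per bank. */
struct fake_intc {
	uint32_t enabled[NR_BANKS];
};

/* Models a disable register where each 1 bit clears that enable bit. */
static void fake_disable(struct fake_intc *c, int bank, uint32_t mask)
{
	assert(bank >= 0 && bank < NR_BANKS);
	c->enabled[bank] &= ~mask;
}

/*
 * Disable every interrupt found enabled and report it.
 * Returns the number of banks that had stale enable bits.
 */
static int quiesce_banks(struct fake_intc *c)
{
	int dirty = 0;

	for (int b = 0; b < NR_BANKS; b++) {
		/* In the driver this would be readl_relaxed(intc.enable[b]). */
		uint32_t stale = c->enabled[b];

		if (!stale)
			continue;

		/* In the driver: writel_relaxed(stale, intc.disable[b]). */
		fake_disable(c, b, stale);
		fprintf(stderr, "bootloader left irq enabled: bank %d mask 0x%08x\n",
			b, (unsigned int)stale);
		dirty++;
	}

	return dirty;
}
```

Calling quiesce_banks() on a model with one arbitrary bit set in one bank
clears that bit and returns 1; a second call finds nothing to do and
returns 0, which is the "clean slate" property the review asks for.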