Date:   Fri, 18 Sep 2020 11:21:59 +0100
From:   Marc Zyngier <maz@...nel.org>
To:     James Morse <james.morse@....com>
Cc:     jonathanh@...dia.com, linux-arm-kernel@...ts.infradead.org,
        linux-kernel@...r.kernel.org, Sumit Garg <sumit.garg@...aro.org>,
        kernel-team@...roid.com, Florian Fainelli <f.fainelli@...il.com>,
        Russell King <linux@....linux.org.uk>,
        Jason Cooper <jason@...edaemon.net>,
        Saravana Kannan <saravanak@...gle.com>,
        Andrew Lunn <andrew@...n.ch>,
        Catalin Marinas <catalin.marinas@....com>,
        Gregory Clement <gregory.clement@...tlin.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Will Deacon <will@...nel.org>,
        Valentin Schneider <valentin.schneider@....com>
Subject: Re: [PATCH v3 08/16] irqchip/gic: Configure SGIs as standard interrupts

Hi James,

On Fri, 18 Sep 2020 10:58:45 +0100,
James Morse <james.morse@....com> wrote:
> 
> Hi Marc,
> 
> (CC: +Jon)
> 
> On 01/09/2020 15:43, Marc Zyngier wrote:
> > Change the way we deal with GIC SGIs by turning them into proper
> > IRQs, and calling into the arch code to register the interrupt range
> > instead of a callback.
> 
> Your comment "This only works because we don't nest SGIs..." on this
> thread tripped some bad memories from adding the irq-stack. Softirq
> causes us to nest irqs, but only once.
> 
> 
> (I've messed with the below diff to remove the added stuff:)
> 
> > diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> > index 4ffd62af888f..4be2b62f816f 100644
> > --- a/drivers/irqchip/irq-gic.c
> > +++ b/drivers/irqchip/irq-gic.c
> > @@ -335,31 +335,22 @@ static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
> >  		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
> >  		irqnr = irqstat & GICC_IAR_INT_ID_MASK;
> >  
> > -		if (likely(irqnr > 15 && irqnr < 1020)) {
> > -			if (static_branch_likely(&supports_deactivate_key))
> > -				writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
> > -			isb();
> > -			handle_domain_irq(gic->domain, irqnr, regs);
> > -			continue;
> > -		}
> > -		if (irqnr < 16) {
> >  			writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
> > -			if (static_branch_likely(&supports_deactivate_key))
> > -				writel_relaxed(irqstat, cpu_base + GIC_CPU_DEACTIVATE);
> > -#ifdef CONFIG_SMP
> > -			/*
> > -			 * Ensure any shared data written by the CPU sending
> > -			 * the IPI is read after we've read the ACK register
> > -			 * on the GIC.
> > -			 *
> > -			 * Pairs with the write barrier in gic_raise_softirq
> > -			 */
> >  			smp_rmb();
> > -			handle_IPI(irqnr, regs);
> 
> If I read this right, previously we would EOI the interrupt before
> calling handle_IPI().  Whereas now with the version of this series
> in your tree, we stuff the to-be-EOId value in a percpu variable,
> which is only safe if these don't nest.
> 
> Hidden in irq_exit(), kernel/softirq.c::__irq_exit_rcu() has this:
> |	preempt_count_sub(HARDIRQ_OFFSET);
> |	if (!in_interrupt() && local_softirq_pending())
> |		invoke_softirq();
> 
> The arch code doesn't raise the preempt counter by HARDIRQ, so once
> __irq_exit_rcu() has dropped it, in_interrupt() returns false, and
> we invoke_softirq().
> 
> invoke_softirq() -> __do_softirq() -> local_irq_enable()!
> 
> Fortunately, __do_softirq() raises the softirq count first using
> __local_bh_disable_ip(), which in_interrupt() checks too, so this
> can only happen once per IRQ.
> 
> Now the irq_exit() has moved from handle_IPI(), which ran after EOI,
> into handle_domain_irq(), which runs before. I think it's possible
> SGIs nest, and the new percpu variable becomes corrupted.

I can't see how. The interrupt is active until we EOI/deactivate it,
and thus cannot be observed again by the CPU interface until this
happens.

Furthermore, irq_exit() in __handle_domain_irq() is *after* the EOI
anyway (generic_handle_irq_() directly calls the flow handler, which
immediately EOIs the interrupt). The only material change is that
irq_enter() happens before EOI. Is that what you are referring to?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
