linux-kernel - Re: [PATCH 2/2] genirq: fasteoi resends interrupt on concurrent invoke

Message-ID: <3903a508c15e7a75b6d637c8523c3bae13d6a7af.camel@amazon.com>
Date:   Thu, 1 Jun 2023 07:24:48 +0000
From:   "Gowans, James" <jgowans@...zon.com>
To:     "maz@...nel.org" <maz@...nel.org>
CC:     "tglx@...utronix.de" <tglx@...utronix.de>,
        "Raslan, KarimAllah" <karahmed@...zon.com>,
        "liaochang1@...wei.com" <liaochang1@...wei.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "zouyipeng@...wei.com" <zouyipeng@...wei.com>,
        "chris.zjh@...wei.com" <chris.zjh@...wei.com>
Subject: Re: [PATCH 2/2] genirq: fasteoi resends interrupt on concurrent
 invoke

On Wed, 2023-05-31 at 08:00 +0100, Marc Zyngier wrote:
> > Generally it should not be possible for the next interrupt to arrive
> > while the previous handler is still running: the next interrupt should
> > only arrive after the EOI message has been sent and the previous handler
> > has returned.
> 
> There is no such message with LPIs. I pointed that out previously.

Arg, thanks, I'll re-word this to:

"Generally it should not be possible for the next interrupt to arrive
while the previous handler is still running: the CPU will not preempt an
interrupt with another from the same source or same priority."

I hope that's more accurate?

> > This issue was observed specifically on an arm64 system with a GIC-v3
> > handling MSIs; GIC-v3 uses the handle_fasteoi_irq handler. The issue is
> > that the global ITS is responsible for affinity but does not know
> > whether interrupts are pending/running, only the CPU-local redistributor
> > handles the EOI. Hence when the affinity is changed in the ITS, the new
> > CPU's redistributor does not know that the original CPU is still running
> > the handler.
> 
> Similar to your previous patch, you don't explain *why* the interrupt
> gets delivered when it is an LPI, and not for any of the other GICv3
> interrupt types. That's an important point.

Right, you pointed out the issue with this sentence too and I missed
updating it. :-/ How about:

"This issue was observed specifically on an arm64 system with a GIC-v3
handling MSIs; GIC-v3 uses the handle_fasteoi_irq handler. The issue is
that the GIC-v3's physical LPIs do not have a global active state. If LPIs
had an active state, then an LPI could not be retriggered until the first
CPU had issued a deactivation."
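
To illustrate the difference an active state makes, here is a toy
user-space model. It is not GIC or kernel code, and every name in it
(toy_irq, toy_raise, toy_deactivate) is made up for this sketch. It only
shows that an interrupt with an active state is held back until
deactivation, while one without (LPI-like) is delivered again straight
away:

```c
/* Toy user-space model, not GIC or kernel code; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

struct toy_irq {
	bool has_active_state;	/* true: SPI-like, false: LPI-like */
	bool active;		/* set on delivery, cleared on deactivation */
	bool pending;		/* latched while delivery is blocked */
};

/* A device raises the interrupt. Returns true if it is delivered now. */
static bool toy_raise(struct toy_irq *irq)
{
	assert(irq);
	if (irq->has_active_state && irq->active) {
		irq->pending = true;	/* held back until deactivation */
		return false;
	}
	if (irq->has_active_state)
		irq->active = true;
	return true;			/* delivered, possibly to another CPU */
}

/* The handling CPU deactivates. Returns true if a latched interrupt fires. */
static bool toy_deactivate(struct toy_irq *irq)
{
	assert(irq);
	irq->active = false;
	if (irq->pending) {
		irq->pending = false;
		irq->active = irq->has_active_state;
		return true;
	}
	return false;
}
```

In this model a second toy_raise() on an LPI-like interrupt returns true
while the first handler is still running, which is the concurrent invoke
the patch has to cope with.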

> 
> > 
> > +     /*
> > +      * When the race descibed above happens, this will resend the interrupt.
> > +      */
> > +     if (unlikely(desc->istate & IRQS_PENDING))
> > +             check_irq_resend(desc, false);
> > +
> >       raw_spin_unlock(&desc->lock);
> >       return;
> >  out:
> 
> While I'm glad that you eventually decided to use the resend mechanism
> instead of spinning on the "old" CPU, I still think imposing this
> behaviour on all users without any discrimination is wrong.
> 
> Look at what it does if an interrupt is a wake-up source. You'd
> pointlessly requeue the interrupt (bonus points if the irqchip doesn't
> provide a HW-based retrigger mechanism).
> 
> I still maintain that this change should only be applied for the
> particular interrupts that *require* it, and not as a blanket change
> affecting everything under the sun. I have proposed such a change in
> the past, feel free to use it or roll your own.

Thanks for the example of where this blanket functionality wouldn't be
desired - I'll re-work this to introduce and use
the IRQD_RESEND_WHEN_IN_PROGRESS flag as you originally suggested.

Just one more thing before I post V3: are you okay with doing the resend
here *after* the handler has finished running, and using the IRQS_PENDING flag
to know to resend it? Or would you like it to be resent in
the !irq_may_run(desc) block as you suggested?

I have a slight preference for doing it afterwards, only once we know the
interrupt is ready to be run again, so that check_irq_resend() does not
need to be modified to cater for multiple retries.
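
As a rough sketch of the "resend after the handler" option, here is a
user-space model of the flow. It is not the actual patch: the struct, the
MODEL_* constants and both functions are hypothetical stand-ins (for
irq_desc, IRQS_PENDING and the proposed per-interrupt opt-in flag), and
the resend itself is reduced to a counter:

```c
/* User-space model only, not kernel code; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

#define MODEL_PENDING		(1u << 0)	/* stands in for IRQS_PENDING */
#define MODEL_INPROGRESS	(1u << 1)	/* handler running on some CPU */
#define MODEL_RESEND_OPT_IN	(1u << 0)	/* stands in for the opt-in flag */

struct model_desc {
	unsigned int flags;		/* per-interrupt configuration */
	unsigned int state;		/* runtime state bits */
	unsigned int handled_count;	/* completed handler runs */
	unsigned int resend_count;	/* resends requested */
};

/* Interrupt arrives on some CPU. Returns true if the handler may run. */
static bool model_irq_enter(struct model_desc *d)
{
	assert(d);
	if (d->state & MODEL_INPROGRESS) {
		/* Concurrent invoke: only remember it if opted in. */
		if (d->flags & MODEL_RESEND_OPT_IN)
			d->state |= MODEL_PENDING;
		return false;
	}
	d->state |= MODEL_INPROGRESS;
	return true;
}

/* Handler has returned: resend once if an invoke was missed meanwhile. */
static void model_irq_exit(struct model_desc *d)
{
	assert(d);
	d->handled_count++;
	d->state &= ~MODEL_INPROGRESS;
	if (d->state & MODEL_PENDING) {
		d->state &= ~MODEL_PENDING;
		d->resend_count++;	/* the real code would resend here */
	}
}
```

Because the resend only happens once the in-progress bit has been cleared,
the resent interrupt can run immediately, so a single resend is enough in
this model. Interrupts that do not opt in are left untouched.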

JG