[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87tuyfxria.wl-maz@kernel.org>
Date: Fri, 10 Jul 2020 17:29:49 +0100
From: Marc Zyngier <maz@...nel.org>
To: Valentin Schneider <valentin.schneider@....com>
Cc: linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
Thomas Gleixner <tglx@...utronix.de>,
Jason Cooper <jason@...edaemon.net>,
Lorenzo Pieralisi <Lorenzo.Pieralisi@....com>
Subject: Re: [RFC PATCH] irqchip/gic: Implement irq_chip->irq_retrigger()
On Fri, 10 Jul 2020 15:56:42 +0100,
Valentin Schneider <valentin.schneider@....com> wrote:
>
> While digging around IRQCHIP_EOI_IF_HANDLED and irq/resend.c, it has come
> to my attention that the IRQ resend situation seems a bit precarious for
> the GIC(s). Below is a (somewhat structured) dump of my notes/thoughts
> about that.
>
> IRQCHIP_EOI_IF_HANDLED
> ======================
>
> If the irqchip doesn't have this flag set, we may issue an irq_eoi()
> despite not having handled the IRQ in the following cases:
>
> o !irq_may_run()
> - IRQ may be in progress (handle_irq_event() ongoing)
> - IRQ is an armed wakeup source (also sets it pending)
>
> o !action or IRQ disabled; in this case it is set as pending.
>
> irq_resend()
> ============
>
> For the above cases where the IRQ is marked as pending, it means that when
> we'll go through check_irq_resend() (e.g. when we re-enable the IRQ) we
> will end up in resend_irqs() which goes through the flow handler again, IOW
> will issue *another* EOI on the same IRQ.
>
> Now, this is all done via a tasklet, so AFAICT cannot happen in irq
> context (as defined by in_interrupt() - it can happen at the tail of
> handling an IRQ if it wasn't nesting).
>
> GIC woes
> ========
>
> The TL;DR for IRQ handling on the GIC is that we have 3 steps:
> o Acknowledgement (Ack)
> o priority drop (PD)
> o deactivation (D)
>
> The GIC also has an "eoimode" knob that says whether PD and D are split, IOW:
> o eoimode=0: irq_eoi() does PD + D
> o eoimode=1: irq_eoi() does D
>
> Regardless of the mode, it is paramount that any PD is
> o issued on the same CPU that Ack'd the IRQ
> o issued in reverse order of the Acks
>
> IHI0069D, 4.1.1 Physical CPU interface, Priority drop
> """
> A priority drop must be performed by the same PE that activated the
> interrupt.
>
> [...]
>
> For each CPU interface, the GIC architecture requires the order of the
> valid writes to ICC_EOIR0_EL1 and ICC_EOIR1_EL1 to be the exact reverse of
> the order of the reads from ICC_IAR0_EL1 and ICC_IAR1_EL1
> """
>
> IHI0069D, 8.2.9 ICC_EOIR0_EL1, Interrupt Controller End Of Interrupt Register 0
> """
> A write to this register must correspond to the most recent valid read by
> this PE from an Interrupt Acknowledge Register, and must correspond to the
> INTID that was read from ICC_IAR0_EL1, otherwise the system behavior is
> UNPREDICTABLE.
> """
>
> For eoimode=1, the PD is hidden from genirq and is contained within the GIC
> driver. This means that issuing another irq_eoi() will only be re-issuing a
> D, which I think the GIC can live with (even if issued from a different CPU).
>
> For eoimode=0, it is more troubling, as we break the aforementioned
> restrictions. That said, IIUC this cannot cause e.g. a bogus running
> priority reduction due to the tasklet context mentioned above (running
> priority ought to be idle priority).
>
> Linux hosts will almost always use eoimode=1, so that leaves us with
> guests running with eoimode=0, and I don't know how common it is (if at all
> possible) for those to use pm / wakeup IRQs. I suppose that is a reason
> why this hasn't cropped up before, that or I miserably misunderstood the
> whole thing.
>
> In any case, the virtual interface follows the same restrictions wrt
> PD ordering:
>
> IHI0069D 8.3.7 ICV_EOIR0_EL1, Interrupt Controller Virtual End Of Interrupt Register 0
> """
> A write to this register must correspond to the most recent valid read by
> this vPE from a Virtual Interrupt Acknowledge Register, and must correspond
> to the INTID that was read from ICV_IAR0_EL1, otherwise the system behavior
> is UNPREDICTABLE.
> """
>
> Change
> ======
>
> One way to ensure we only get one PD per interrupt activation and maintain
> the expected ordering is to add the IRQCHIP_EOI_IF_HANDLED flag to all
> irq-gic chips, but that only really works for eoimode=1; for eoimode=0 that
> would mean leaving the flow handler without having issued a PD at all.
>
> At the same time, this whole IRQS_PENDING & resend thing feels like
> something we can handle in hardware: we can leverage
> irq_chip.irq_retrigger() and use this to mark the interrupt as pending in
> the GIC itself, in which case we *don't* want to have
> IRQCHIP_EOI_IF_HANDLED (as the retrigger will lead to another ack+eoi
> pair).
>
> Implement irq_chip.irq_retrigger() for both GICs.
Although I am very grateful for the whole documentation, I'd rather
have a slightly more condensed changelog that documents the
implementation of the retrigger callback! ;-)
>
> Signed-off-by: Valentin Schneider <valentin.schneider@....com>
> ---
> drivers/irqchip/irq-gic-v3.c | 7 +++++++
> drivers/irqchip/irq-gic.c | 6 ++++++
> 2 files changed, 13 insertions(+)
>
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index cc46bc2d634b..c025e8b51464 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1207,6 +1207,11 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
> #define gic_smp_init() do { } while(0)
> #endif
>
> +static int gic_retrigger(struct irq_data *data)
> +{
> + return gic_irq_set_irqchip_state(data, IRQCHIP_STATE_PENDING, true);
If I'm not mistaken, check_irq_resend() requires a non-zero return
value if the retrigger has succeeded. So something like
return !gic_irq_set_irqchip_state(data, IRQCHIP_STATE_PENDING, true);
would be more appropriate.
Otherwise, looks good.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
Powered by blists - more mailing lists