linux-kernel - Re: [RFC PATCH] irqchip/gic: Implement irq_chip->irq

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87tuyfxria.wl-maz@kernel.org>
Date:   Fri, 10 Jul 2020 17:29:49 +0100
From:   Marc Zyngier <maz@...nel.org>
To:     Valentin Schneider <valentin.schneider@....com>
Cc:     linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        Thomas Gleixner <tglx@...utronix.de>,
        Jason Cooper <jason@...edaemon.net>,
        Lorenzo Pieralisi <Lorenzo.Pieralisi@....com>
Subject: Re: [RFC PATCH] irqchip/gic: Implement irq_chip->irq_retrigger()

On Fri, 10 Jul 2020 15:56:42 +0100,
Valentin Schneider <valentin.schneider@....com> wrote:
> 
> While digging around IRQCHIP_EOI_IF_HANDLED and irq/resend.c, it has come
> to my attention that the IRQ resend situation seems a bit precarious for
> the GIC(s). Below is a (somewhat structured) dump of my notes/thoughts
> about that.
> 
> IRQCHIP_EOI_IF_HANDLED
> ======================
> 
> If the irqchip doesn't have this flag set, we may issue an irq_eoi()
> despite not having handled the IRQ in the following cases:
> 
> o !irq_may_run()
>   - IRQ may be in progress (handle_irq_event() ongoing)
>   - IRQ is an armed wakeup source (also sets it pending)
> 
> o !action or IRQ disabled; in this case it is set as pending.
> 
> irq_resend()
> ============
> 
> For the above cases where the IRQ is marked as pending, it means that when
> we'll go through check_irq_resend() (e.g. when we re-enable the IRQ) we
> will end up in resend_irqs() which goes through the flow handler again, IOW
> will issue *another* EOI on the same IRQ.
> 
> Now, this is all done via a tasklet, so AFAICT cannot happen in irq
> context (as defined by in_interrupt() - it can happen at the tail of
> handling an IRQ if it wasn't nesting).
> 
> GIC woes
> ========
> 
> The TL;DR for IRQ handling on the GIC is that we have 3 steps:
> o Acknowledgement (Ack)
> o priority drop (PD)
> o deactivation (D)
> 
> The GIC also has an "eoimode" knob that says whether PD and D are split, IOW:
> o eoimode=0: irq_eoi() does PD + D
> o eoimode=1: irq_eoi() does D
> 
> Regardless of the mode, it is paramount that any PD is
> o issued on the same CPU that Ack'd the IRQ
> o issued in reverse order of the Acks
> 
> IHI0069D, 4.1.1 Physical CPU interface, Priority drop
> """
> A priority drop must be performed by the same PE that activated the
> interrupt.
> 
> [...]
> 
> For each CPU interface, the GIC architecture requires the order of the
> valid writes to ICC_EOIR0_EL1 and ICC_EOIR1_EL1 to be the exact reverse of
> the order of the reads from ICC_IAR0_EL1 and ICC_IAR1_EL1
> """
> 
> IHI0069D, 8.2.9 ICC_EOIR0_EL1, Interrupt Controller End Of Interrupt Register 0
> """
> A write to this register must correspond to the most recent valid read by
> this PE from an Interrupt Acknowledge Register, and must correspond to the
> INTID that was read from ICC_IAR0_EL1, otherwise the system behavior is
> UNPREDICTABLE.
> """
> 
> For eoimode=1, the PD is hidden from genirq and is contained within the GIC
> driver. This means that issuing another irq_eoi() will only be re-issuing a
> D, which I think the GIC can live with (even if issued from a different CPU).
> 
> For eoimode=0, it is more troubling, as we break the aforementioned
> restrictions. That said, IIUC this cannot cause e.g. a bogus running
> priority reduction due to the tasklet context mentioned above (running
> priority ought to be idle priority).
> 
> Linux hosts will almost always use eoimode=1, so that leaves us with
> guests running with eoimode=0, and I don't know how common it is (if at all
> possible) for those to use pm / wakeup IRQs. I suppose that is a reason
> why this hasn't cropped up before, that or I miserably misunderstood the
> whole thing.
> 
> In any case, the virtual interface follows the same restrictions wrt
> PD ordering:
> 
> IHI0069D 8.3.7 ICV_EOIR0_EL1, Interrupt Controller Virtual End Of Interrupt Register 0
> """
> A write to this register must correspond to the most recent valid read by
> this vPE from a Virtual Interrupt Acknowledge Register, and must correspond
> to the INTID that was read from ICV_IAR0_EL1, otherwise the system behavior
> is UNPREDICTABLE.
> """
> 
> Change
> ======
> 
> One way to ensure we only get one PD per interrupt activation and maintain
> the expected ordering is to add the IRQCHIP_EOI_IF_HANDLED flag to all
> irq-gic chips, but that only really works for eoimode=1; for eoimode=0 that
> would mean leaving the flow handler without having issued a PD at all.
> 
> At the same time, this whole IRQS_PENDING & resend thing feels like
> something we can handle in hardware: we can leverage
> irq_chip.irq_retrigger() and use this to mark the interrupt as pending in
> the GIC itself, in which case we *don't* want to have
> IRQCHIP_EOI_IF_HANDLED (as the retrigger will lead to another ack+eoi
> pair).
> 
> Implement irq_chip.irq_retrigger() for both GICs.

Although I am very grateful for the whole documentation, I'd rather
have a slightly more condensed changelog that documents the
implementation of the retrigger callback! ;-)

> 
> Signed-off-by: Valentin Schneider <valentin.schneider@....com>
> ---
>  drivers/irqchip/irq-gic-v3.c | 7 +++++++
>  drivers/irqchip/irq-gic.c    | 6 ++++++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index cc46bc2d634b..c025e8b51464 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -1207,6 +1207,11 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
>  #define gic_smp_init()		do { } while(0)
>  #endif
>  
> +static int gic_retrigger(struct irq_data *data)
> +{
> +	return gic_irq_set_irqchip_state(data, IRQCHIP_STATE_PENDING, true);

If I'm not mistaken, check_irq_resend() requires a non-zero return
value if the retrigger has succeeded. So something like

      return !gic_irq_set_irqchip_state(data, IRQCHIP_STATE_PENDING, true);

would be more appropriate.

Otherwise, looks good.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.