linux-kernel - Re: [PATCH] irqchip/gicv3-its: Workaround for GIC-700 erratum 2195890

Message-ID: <86sex1hz78.wl-maz@kernel.org>
Date: Tue, 25 Jun 2024 15:30:35 +0100
From: Marc Zyngier <maz@...nel.org>
To: Roman Kagan <rkagan@...zon.de>,
	Marc Zyngier <maz@...nel.org>,
	linux-arm-kernel@...ts.infradead.org,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will@...nel.org>,
	nh-open-source@...zon.com,
	linux-doc@...r.kernel.org,
	linux-kernel@...r.kernel.org,
	Thomas Gleixner <tglx@...utronix.de>,
	Jonathan Corbet <corbet@....net>
Subject: Re: [PATCH] irqchip/gicv3-its: Workaround for GIC-700 erratum 2195890

On Tue, 25 Jun 2024 14:54:28 +0100,
Roman Kagan <rkagan@...zon.de> wrote:
> 
> On Tue, Jun 25, 2024 at 09:45:22AM +0100, Marc Zyngier wrote:
> > On Mon, 24 Jun 2024 17:55:41 +0100,
> > Roman Kagan <rkagan@...zon.de> wrote:
> > >
> > > According to Arm CoreLink GIC-700 erratum 2195890, on GIC revisions
> > > r0p0, r0p1, r1p0 under certain conditions LPIs may remain in the Pending
> > > Table until one of a number of external events occurs.
> > 
> > Please add a link to the errata document.
> 
> https://developer.arm.com/documentation/SDEN-1769194/
> Will include when respinning.
> 
> > >
> > > No LPIs are lost but they may not be delivered in a finite time.
> > >
> > > The workaround is to issue an INV using GICR_INVLPIR to an unused, in
> > > range LPI ID to retrigger the search.
> > >
> > > Add this workaround to the quirk table.  When the quirk is applicable,
> > > carve out one LPI ID from the available range and run periodic work to
> > > do INV to it, in order to prevent GIC from stalling.
> > 
> > The errata document says a lot more:
> > 
> > <quote>
> > For physical LPIs the workaround is to issue an INV using GICR_INVLPIR
> > to an unused, in range LPI ID to retrigger the search. This could be
> > done periodically, for example, in line with a residency change, or as
> > part of servicing LPIs.  If using LPIs as the event, then the
> > GICR_INVLPIR write could be issued after servicing every LPI.
> > 
> > However, it only needs to be issued if:
> > 
> > * At least 4 interrupts in the block of 32 are enabled and mapped to
> >   the current PE or, if easier,
> > 
> > * At least 4 interrupts in the block of 32 are enabled and mapped to
> >   any PE
> > </quote>
> 
> It didn't feel worth optimizing for.  I'll reconsider.

I'm not sure we want to optimise for it, but I'd certainly want to
hear the *rationale* behind not considering the optimisation.
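
For reference, the condition quoted from the errata document could be
checked along these lines. This is only a userspace sketch under an
assumed data model (one 32-bit word per block of 32 LPIs, a set bit
meaning "enabled and mapped"); popcount32() and inv_needed() are
hypothetical names, not driver functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Threshold from the errata text: 4 enabled interrupts per block of 32. */
#define ERRATUM_MIN_ENABLED 4

/* Portable population count (hypothetical helper). */
static unsigned int popcount32(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Assumed model: blocks[i] has one bit per LPI in the i-th block of 32,
 * set when that LPI is enabled and mapped. Returns true if any single
 * block reaches the threshold, i.e. the INV would be required.
 */
bool inv_needed(const uint32_t *blocks, size_t nr_blocks)
{
	for (size_t i = 0; i < nr_blocks; i++)
		if (popcount32(blocks[i]) >= ERRATUM_MIN_ENABLED)
			return true;
	return false;
}
```

Note that four enabled LPIs spread over four different blocks do not
meet the condition; only the count within one block matters.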

> 
> > > TT: https://t.corp.amazon.com/D82032616
> > 
> > Gniii????
> 
> Indeed Q-/
> 
> > > Signed-off-by: Elad Rosner <eladros@...zon.com>
> > > Signed-off-by: Mohamed Mediouni <mediou@...zon.com>
> > > Signed-off-by: Roman Kagan <rkagan@...zon.de>
> > 
> > Who is the author?
> 
> Joint effort aka inherited ownership.  Will fix according to the
> process doc.
> 
> > > +static void __maybe_unused its_quirk_gic700_2195890_work_handler(struct work_struct *work)
> > > +{
> > > +     int cpu;
> > > +     void __iomem *rdbase;
> > > +     u64 gicr_invlpir_val;
> > > +
> > > +     for_each_online_cpu(cpu) {
> > 
> > > The errata document doesn't say that this needs to happen for *every*
> > > RD. Can you please clarify this?
> 
> (Digging out a year-old comms with ARM)
> > > In multi-chip GIC system, does this write have to happen in each
> > > chip or would a write to a single GICR trigger the search in all
> > > GICDs?
> > The write needs to occur for each physical PE - in other words, to
> > each individual GICR that the search needs to be re-triggered for.

OK, that pretty much rules out doing anything clever (note to self,
check the GIC revision before buying the HW...).

> 
> > > +             raw_spin_lock(&gic_data_rdist_cpu(cpu)->rd_lock);
> > > +             gic_write_lpir(gicr_invlpir_val, rdbase + GICR_INVLPIR);
> > > +             raw_spin_unlock(&gic_data_rdist_cpu(cpu)->rd_lock);
> > 
> > No synchronisation? How is that supposed to work?
> > 
> > Also, if you need to dig into the internals of the driver, extract a
> > helper from __direct_lpi_inv().
> 
> ACK
> 
> > > +     }
> > > +
> > > +     schedule_delayed_work(&its_quirk_gic700_2195890_data.work,
> > > +             msecs_to_jiffies(ITS_QUIRK_GIC700_2195890_PERIOD_MSEC));
> > 
> > It would be pretty easy to detect whether an LPI was ack'ed since the
> > last pass, and not issue the invalidate.
> 
> Makes sense, will look into it.
> 
> Overall, do you think this approach with a global work looping over cpus
> is the right one, or should we rather try to implement something
> per-cpu?

One of my worries is that you're crossing all node boundaries by doing
this, which is going to suck on really large systems. If anything,
you'd be better off with a per-node worker.

It doesn't need to be implemented right now, but I have the feeling
that someone is going to ask.
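
Going back to the point about skipping the invalidate when an LPI was
ack'ed since the last pass, the bookkeeping could look roughly like
this. It is a single-threaded userspace sketch: struct rd_prod_state
and both functions are hypothetical, and real code would need per-CPU
data and proper ordering against the interrupt path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-redistributor state, not the driver's structures. */
struct rd_prod_state {
	uint64_t lpi_acked;	/* bumped on every LPI ack */
	uint64_t last_seen;	/* lpi_acked as sampled by the previous pass */
};

/* Would be called from the (assumed) LPI ack path. */
void rd_note_lpi_ack(struct rd_prod_state *s)
{
	s->lpi_acked++;
}

/*
 * Would be called once per period for each redistributor. Returns true
 * when no LPI was ack'ed since the previous pass, which is the only
 * case where the INV write is worth issuing.
 */
bool rd_needs_inv(struct rd_prod_state *s)
{
	bool idle = (s->lpi_acked == s->last_seen);

	s->last_seen = s->lpi_acked;
	return idle;
}
```

A redistributor that keeps delivering LPIs never gets the extra write;
one that has gone quiet gets prodded on the next pass.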

> 
> > > +}
> > > +
> > > +static bool __maybe_unused its_enable_quirk_gic700_2195890(void *data)
> > > +{
> > > +     struct its_node *its = data;
> > > +
> > > +     if (its_quirk_gic700_2195890_data.lpi)
> > > +             return true;
> > > +
> > > +     /*
> > > +      * Use one LPI INTID from the start of the LPI range for GIC prodding,
> > > +      * and make it unavailable for regular LPI use later.
> > > +      */
> > > +     its_quirk_gic700_2195890_data.lpi = lpi_id_base++;
> > > +
> > > +     INIT_DELAYED_WORK(&its_quirk_gic700_2195890_data.work,
> > > +                       its_quirk_gic700_2195890_work_handler);
> > > +     schedule_delayed_work(&its_quirk_gic700_2195890_data.work, 0);
> > > +
> > > +     return true;
> > > +}
> > 
> > It is a bit odd to hook this on an ITS being probed when the ITS isn't
> > really involved. Not a big deal, but a bit clumsy.
> 
> True, but the LPI allocation lives in this file so it looked easier to
> wire it all up here.  Where do you think it's more appropriate?

But the allocation doesn't really take place, does it? You just nick
one LPI. Which by the way I'd rather you pick the last one instead of
the first, as this messes with devices that require ye oldie MultiMSI
and its stupid alignment requirements.

Also, you have its_cpu_init() which takes care of RDs. You could
absolutely add the quirk to the main GIC driver (based on the
distributor IIDR), add a flag to rdists.flags, and let it roll.

Either way, I don't really care. Maybe keeping it centralised is good
enough.
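
To illustrate the alignment concern with taking the first LPI, here is
a small sketch. It assumes an allocator that must hand out naturally
aligned power-of-two blocks for MultiMSI; that is a simplification and
not a description of the actual allocator, and both helpers are
hypothetical.

```c
#include <stdint.h>

/* LPI INTIDs start at 8192 in GICv3. */
#define LPI_BASE 8192u

/*
 * First INTID at or above base that is aligned to nvec.
 * nvec must be a power of two (assumed MultiMSI constraint).
 */
uint32_t first_aligned_block(uint32_t base, uint32_t nvec)
{
	return (base + nvec - 1) & ~(nvec - 1);
}

/* INTIDs skipped before that first aligned block. */
uint32_t ids_skipped(uint32_t base, uint32_t nvec)
{
	return first_aligned_block(base, nvec) - base;
}
```

With the base left at LPI_BASE, a 32-vector block starts right at 8192.
After lpi_id_base++ the base is 8193 and the first aligned block starts
at 8224, so 31 IDs sit unused in front of it under this model. Taking
the last ID of the range leaves the base untouched.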

> 
> > >  static const struct gic_quirk its_quirks[] = {
> > >  #ifdef CONFIG_CAVIUM_ERRATUM_22375
> > >       {
> > > @@ -4822,6 +4879,17 @@ static const struct gic_quirk its_quirks[] = {
> > >               .property = "dma-noncoherent",
> > >               .init   = its_set_non_coherent,
> > >       },
> > > +#ifdef CONFIG_ARM64_ERRATUM_2195890
> > > +     {
> > > +             .desc   = "ITS: GIC-700 erratum 2195890",
> > > +             /*
> > > +              * Applies to r0p0, r0p1, r1p0: iidr_var(bits 16..19) == 0 or 1
> > > +              */
> > > +             .iidr   = 0x0400043b,
> > > +             .mask   = 0xfffeffff,
> > > +             .init   = its_enable_quirk_gic700_2195890,
> > 
> > This catches r0p0 and r1p0, but not r0p1 (you require that bits 15:12
> > are 0).
> 
> Ouch, right.  Given the erratum exact wording
> 
> > Fault Status: Present in: r0p0, r0p1, r1p0 Fixed in: r2p0
> 
> I guess I should match everything below r2p0 and allow arbitrary bits
> 15:12 (i.e. set the third nibble in the mask to 0).

Either that or you could have two entries. Or use the fact that there
is no released r1p1 to your advantage...
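
The mask arithmetic is easy to check in userspace. The sketch below
assumes the field layout used in this thread (bits 19:16 carry the rN
part, bits 15:12 the pN part) and that a quirk entry matches when the
masked IIDR equals the entry's value; gic700_iidr() and quirk_matches()
are hypothetical helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* IIDR value from the patch, i.e. GIC-700 r0p0. */
#define GIC700_IIDR_R0P0 0x0400043bu

/* Build the IIDR for rNpM under the layout assumed above. */
uint32_t gic700_iidr(unsigned int r, unsigned int p)
{
	return GIC700_IIDR_R0P0 |
	       ((uint32_t)(r & 0xf) << 16) |
	       ((uint32_t)(p & 0xf) << 12);
}

/* Assumed matching rule: masked IIDR must equal the entry's value. */
bool quirk_matches(uint32_t iidr, uint32_t quirk_iidr, uint32_t mask)
{
	return quirk_iidr == (iidr & mask);
}
```

With the posted mask 0xfffeffff, r0p0 and r1p0 match but r0p1 does not.
With 0xfffe0fff, r0p0, r0p1 and r1p0 all match and r2p0 does not; r1p1
would match as well, which is harmless if it was never released.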

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.