Message-ID: <87o7tfutv5.wl-maz@kernel.org>
Date: Thu, 10 Nov 2022 07:54:54 +0000
From: Marc Zyngier <maz@...nel.org>
To: Mukesh Ojha <quic_mojha@...cinc.com>
Cc: <linux-arm-kernel@...ts.infradead.org>, <catalin.marinas@....com>,
<will@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
lkml <linux-kernel@...r.kernel.org>
Subject: Re: Query on handling some special Group0 interrupt in Linux

On Wed, 09 Nov 2022 19:57:24 +0000,
Mukesh Ojha <quic_mojha@...cinc.com> wrote:
>
> Hi Marc,
>
> Thanks for your reply.
>
> On 11/9/2022 11:50 PM, Marc Zyngier wrote:
> > On Wed, 09 Nov 2022 16:20:35 +0000,
> > Mukesh Ojha <quic_mojha@...cinc.com> wrote:
> >>
> >> Hi,
> >>
> >> I was working on a use case where both EL2 and EL3 are implemented and
> >> we have a watchdog interrupt (SPI), which is used for detecting software
> >> hangs and causing a device reset. If that interrupt's current CPU
> >> affinity is on a core where interrupts are disabled, we won't be able
> >> to serve it. If instead it arrives on a core which has interrupts
> >> enabled, calling panic() or smp_send_stop() would not let us see the
> >> call stacks of the other cores that are running with interrupts
> >> disabled.
> >>
> >> I was thinking of configuring both a watchdog IRQ (SPI) and IPI_STOP
> >> (SGI) or any reserved IPI as an FIQ. And from the watchdog IRQ handler,
> >> I was thinking of calling panic(), which eventually sends IPI_STOP (SGI
> >> FIQ) to all the cores. With this we will be able to dump the call
> >> stacks of all the cores.
> >>
> >> I am able to achieve this but wanted to know if this is acceptable to
> >> the community to support/allow such use cases like above and enable
> >> group0 interrupt from GIC for some special use cases.
> >
> > For a start, we only deal with Group-1 interrupts in Linux. Group-0
> > interrupts are for the firmware, and we really don't want to see them
> > (this is consistent with your HW having EL3).
>
> What is the downside if we support this? I see one
> implementation here.
>
> https://elixir.bootlin.com/linux/v6.0.7/source/drivers/irqchip/irq-apple-aic.c#L510

You do realise that this system doesn't even have a GIC, and only uses
FIQ to represent per-CPU interrupts, right?

>
> > We also mask IRQ and FIQ at the same time, so this is a non-starter.
> This can be taken care of if we support this.

No. We've made the decision not to treat IRQ and FIQ differently,
because FIQ only matters for systems with a single security domain
such as VMs or wonky systems such as the above. With that, all systems
behave the same and are treated the same, which keeps the rules for
interrupt preemption understandable and means we don't have to think
about IRQ and FIQ racing with each other.

>
> >
> > If you want to be able to deliver an interrupt while the interrupts
> > are masked, what you are looking for is the NMI framework, for which
> > you can register SPIs as (pseudo-)NMI.
>
> Yes, kind of NMI. I have already looked into this. In our system EL2
> is implemented, so each physical interrupt gets routed to the
> hypervisor and later arrives at EL1 as a vIRQ. Each interrupt
> enable/disable call then exercises a PMR register trap, which can
> cause latency in regular operation (for example with multiple VMs).

Then your hypervisor needs fixing. There is no need to trap accesses
to PMR. Also, PMR being per-CPU, there should be no extra overhead
depending on the number of VMs even if you were trapping PMR (for
example to work around broken HW).

To sum it up, none of the above makes much sense to me.

> Since some of the use cases could be special, like I have mentioned
> in my initial mail where such interrupt will be fatal and system will
> get reset after that. I am not able to think of any other use case than
> this, but can it not be considered as a feature?

Well, we don't add stuff to the kernel based on idle considerations,
and what you are describing so far matches 100% the requirement for an
NMI-like feature.

The architecture has two ways to implement almost-NMIs: interrupt
priorities (our current crop of pseudo-NMIs) and the ARMv8.8
FEAT_NMI. The former is already there, and there are patches on the
list for the latter.
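
To make the priority-based approach concrete, here is a toy userspace
model in C. It is not kernel code: every name and priority value in it
is made up for illustration. It only shows the idea that "disabling
interrupts" becomes raising a per-CPU priority mask, so an interrupt
configured with a higher priority than the mask still gets through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustrative values only (assumptions, not taken from this thread).
 * As in the GIC, a numerically lower value means a higher priority.
 */
enum {
	MODEL_PRIO_NMI     = 0x20, /* hypothetical priority for pseudo-NMIs */
	MODEL_PRIO_IRQ     = 0xa0, /* hypothetical priority for normal IRQs */
	MODEL_PMR_IRQS_ON  = 0xe0, /* mask that lets both classes through */
	MODEL_PMR_IRQS_OFF = 0x60, /* mask that blocks normal IRQs only */
};

/* Models one CPU; pmr stands in for the per-CPU priority mask register. */
struct cpu_model {
	uint8_t pmr;
};

static inline void model_irq_disable(struct cpu_model *cpu)
{
	cpu->pmr = MODEL_PMR_IRQS_OFF;
}

static inline void model_irq_enable(struct cpu_model *cpu)
{
	cpu->pmr = MODEL_PMR_IRQS_ON;
}

/*
 * An interrupt reaches the CPU only if its priority is strictly higher
 * than the mask, i.e. numerically lower than pmr.
 */
static inline bool model_delivered(const struct cpu_model *cpu, uint8_t prio)
{
	return prio < cpu->pmr;
}

/* Walks through the "interrupts off, NMI still delivered" case. */
static inline void model_demo(void)
{
	struct cpu_model cpu = { .pmr = MODEL_PMR_IRQS_ON };

	model_irq_disable(&cpu);
	assert(!model_delivered(&cpu, MODEL_PRIO_IRQ)); /* normal IRQ held off */
	assert(model_delivered(&cpu, MODEL_PRIO_NMI));  /* pseudo-NMI gets in */
	model_irq_enable(&cpu);
	assert(model_delivered(&cpu, MODEL_PRIO_IRQ));
}
```

In this model a watchdog SPI given MODEL_PRIO_NMI would still be
delivered to a core that sits in an "interrupts disabled" section, which
is the behaviour asked for at the start of this thread.
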
Do we need a third way that only works for odd corner cases and that
adds a huge amount of complexity? No, thank you.

M.
--
Without deviation from the norm, progress is not possible.