Message-ID: <bc177e2d-5839-3bda-a35b-783ad5e1b3df@codeaurora.org>
Date: Fri, 8 May 2020 21:46:01 +0530
From: Neeraj Upadhyay <neeraju@...eaurora.org>
To: Marc Zyngier <maz@...nel.org>
Cc: julien.thierry.kdev@...il.com, linux-kernel@...r.kernel.org
Subject: Re: Query regarding pseudo nmi support on GIC V3 and request_nmi()
Hi Marc,
Thanks a lot for your comments. I will work on exploring how SDEI can be
used for it.
Thanks
Neeraj
On 5/8/2020 9:41 PM, Marc Zyngier wrote:
> On Fri, 08 May 2020 14:34:10 +0100,
> Neeraj Upadhyay <neeraju@...eaurora.org> wrote:
>>
>> Hi Marc,
>>
>> On 5/8/2020 6:23 PM, Marc Zyngier wrote:
>>> On Fri, 8 May 2020 18:09:00 +0530
>>> Neeraj Upadhyay <neeraju@...eaurora.org> wrote:
>>>
>>>> Hi Marc,
>>>>
>>>> On 5/8/2020 5:57 PM, Marc Zyngier wrote:
>>>>> On Fri, 8 May 2020 16:36:42 +0530
>>>>> Neeraj Upadhyay <neeraju@...eaurora.org> wrote:
>>>>>
>>>>>> Hi Marc,
>>>>>>
>>>>>> On 5/8/2020 4:15 PM, Marc Zyngier wrote:
>>>>>>> On Thu, 07 May 2020 17:06:19 +0100,
>>>>>>> Neeraj Upadhyay <neeraju@...eaurora.org> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have one query regarding pseudo NMI support on GIC v3; from what I
>>>>>>>> could understand, GIC v3 supports pseudo NMI setup for SPIs and PPIs.
>>>>>>>> However the request_nmi() in irq framework requires NMI to be per cpu
>>>>>>>> interrupt source (it checks for IRQF_PERCPU). Can you please help
>>>>>>>> understand this part, how SPIs can be configured as NMIs, if there is
>>>>>>>> a per cpu interrupt source restriction?
>>>>>>>
>>>>>>> Let me answer your question by another question: what is the semantic
>>>>>>> of a NMI if you can't associate it with a particular CPU?
>>>>>>
>>>>>> I was actually thinking of a use case, where, we have a watchdog
>>>>>> interrupt (which is a SPI), which is used for detecting software
>>>>>> hangs and cause device reset; If that interrupt's current cpu
>>>>>> affinity is on a core, where interrupts are disabled, we won't be
>>>>>> able to serve it; so, we need to group that interrupt as an fiq;
>>>>>
>>>>> Linux doesn't use Group-0 interrupts, as they are strictly secure
>>>>> (unless your SoC doesn't have EL3, which I doubt).
>>>>
>>>> Yes, we handle that watchdog interrupt as a Group-0 interrupt, which
>>>> is handled as fiq in EL3.
>>>>
>>>>>
>>>>>> I was thinking, if it's feasible to mark that interrupt as pseudo
>>>>>> NMI and route it to EL1 as irq. However, looks like that is not the
>>>>>> semantic of a NMI and we would need something like pseudo NMI ipi
>>>>>> for this.
>>>>>
>>>>> Sending a NMI IPI from another NMI handler? Even once I've added
>>>>> these, there is no way this will work for that particular scenario.
>>>>> Just look at the restrictions we impose on NMIs.
>>>>>
>>>>
>>>> Sending a pseudo NMI IPI (to EL1) from fiq handler (which runs in
>>>> EL3); I will check, but do you think, that might not work?
>>>
>>> How do you know, from EL3, what to write in memory so that the NMI
>>> handler knows what you want to do? Are you going to parse the S1 page
>>> tables? Hard-code the behaviour of some random Linux version in your
>>> legendary non-updatable firmware? This isn't an acceptable behaviour.
>>>
>>
>> Ok, I understand;
>>
>> Initial thought was to use watchdog SPI as pseudo NMI; however, as
>> pseudo NMIs are only per CPU sources, we were exploring the
>> possibility of using an unused ipi (using the work which is done in
>> [1] and [2] for SGIs) as pseudo NMI, which EL3 sends to EL1, on
>> receiving watchdog fiq. The pseudo NMI handler would collect required
>> debug information, to help identify the lockup cause. We weren't
>> thinking of communicating any information from EL3 fiq handler to
>> EL1.
>
> What if the operating system running at EL1/EL2 is *not* Linux?
>
>>
>> However, from this discussion, I realize that calling irq handler from
>> fiq handler, would not be possible. So, the approach looks flawed.
>>
>> I believe, allowing a non-percpu pseudo NMI is not acceptable to the community?
>
> No, I really don't want to entertain this idea, because the semantics
> are way too loosely defined and you'd end up with everyone wanting
> something mildly different.
>
>>> An IPI is between two CPUs used by the same SW entity. What runs at
>>> EL3 is completely alien to Linux, and so is Linux to EL3. If you want
>>> to IPI, send Group-0 IPIs that are private to the firmware.
>>>
>>
>> Ok got it; however, I wonder what's the use case of sending
>> SGI to EL1, from secure world, using ICC_ASGI1R. I thought it
>> allowed communication between EL1 and EL3; but, looks like I
>> understood it wrong.
>
> There is what the GIC architecture can do, and there is what is
> sensible for Linux. The GIC allows IPIs from S-to-NS as well as the
> opposite. This doesn't make it a good idea (it actually is a terrible
> idea, and I really hope that future versions of the architecture will
> simply kill the feature).
>
> The idea was that you'd make SGIs a first-class ABI between S and
> NS. Given that the two are developed separately and that nobody ever
> standardised what the SGI numbers mean, this idea is completely dead.
>
>>
>>> If you want to inject NMI-type exceptions into EL1, you can always try
>>> SDEI (did I actually write this? Help!). But given your use case below,
>>> that wouldn't work either.
>>>
>>
>> Ok.
>>
>>>>> Frankly, if all you need to do is to reset the SoC, use EL3
>>>>> firmware. That is what it is for.
>>>>>
>>>>
>>>> Before triggering SoC reset, we want to collect certain EL1 debug
>>>> information like stack trace for CPUs and other debug information.
>>>
>>> Frankly, if you are going to reset the SoC because EL1/EL2 has gone
>>> bust, how can you trust it to do anything sensible when injecting an
>>> interrupt? Once you take a SPI at EL3, gather whatever state you want
>>> from EL3. Do not involve EL1 at all.
>>>
>>> M.
>>>
>>
>> Agree that it might not work for all cases. But, for the cases like,
>> some kernel code is stuck after disabling local irqs; pseudo NMI might
>> still be able to run and capture stack and other debug info, to help
>> detect the cause of lockups.
>
> And for that we'll have pseudo-NMI IPIs, initiated from the kernel
> itself as part of the normal debugging infrastructure. It is the EL3
> initiated IPI to EL1 that I strongly oppose. Not to mention
> that if the kernel locks up with PSTATE.I set (which still happens on
> exception entry), the pseudo-NMI won't work either.
>
> As I said, you only have two options: either implement everything in
> EL3 (and the NS OS doesn't need to know anything at all), or use SDEI
> as the architected way to inject an exception into the NS world (and
> Linux already supports it).
>
> M.
>
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation