Message-ID: <f8e73ed7-f45f-0f5d-9055-486fb83dcd82@linux.alibaba.com>
Date: Thu, 14 Oct 2021 22:18:54 +0800
From: 乱石 <zhangliguang@...ux.alibaba.com>
To: James Morse <james.morse@....com>
Cc: linux-acpi@...r.kernel.org, linux-kernel@...r.kernel.org,
Tony Luck <tony.luck@...el.com>,
linux-arm-kernel@...ts.infradead.org,
Borislav Petkov <bp@...en8.de>, Len Brown <lenb@...nel.org>,
"Rafael J. Wysocki" <rafael@...nel.org>,
huangming@...ux.alibaba.com
Subject: Re: [PATCH V2] ACPI / APEI: restore interrupt before panic in sdei
flow
Hi,
On 2021/10/14 1:44, James Morse wrote:
> Hello!
>
> On 12/10/2021 15:29, Liguang Zhang wrote:
>> When the HEST ACPI table configures the Hardware Error Notification type as
>> Software Delegated Exception (0x0B) for RAS events, the OS RAS code interacts
>> with ATF through the SDEI mechanism. On a firmware-first system, the OS is
>> notified by an ATF SDEI call.
>>
>> The call flow when a fatal RAS error happens is as follows:
>>
>> ATF notify OS flow:
>> sdei_dispatch_event()
>> ehf_activate_priority()
>> call sdei callback // callback registered by OS
>> ehf_deactivate_priority()
>>
>> OS sdei callback:
>> sdei_asm_handler()
>> __sdei_handler()
>> _sdei_handler()
>> sdei_event_handler()
>> ghes_sdei_critical_callback()
>> ghes_in_nmi_queue_one_entry()
>> /* if RAS error is fatal */
>> __ghes_panic()
>> panic()
>>
>> If a fatal RAS error occurred, panic() was called in sdei_asm_handler()
>> without ehf_deactivate_priority() being executed, which left interrupts masked.
> So far the story is:
> Firmware generated an SDEI event (a kind of software NMI) because of a firmware
> interrupt, but it hasn't completely handled the interrupt.
>
>
>> With interrupts masked, the system halts in the kdump flow like this:
>>
>> arm-smmu-v3 arm-smmu-v3.3.auto: allocated 65536 entries for cmdq
>> arm-smmu-v3 arm-smmu-v3.3.auto: allocated 32768 entries for evtq
>> arm-smmu-v3 arm-smmu-v3.3.auto: allocated 65536 entries for priq
>> arm-smmu-v3 arm-smmu-v3.3.auto: SMMU currently enabled! Resetting...
> How and why do firmware interrupts affect the IOMMU?
>
> It sounds like you are sharing something with firmware that you shouldn't.
>
>
>> After debugging, we found that the exact position of the hang is:
>> arm_smmu_device_probe()
>> arm_smmu_device_reset()
>> arm_smmu_device_disable()
>> arm_smmu_write_reg_sync()
>> readl_relaxed_poll_timeout()
>> readx_poll_timeout()
>> read_poll_timeout()
>> usleep_range() // the hrtimer never wakes the task
>>
>> So interrupts should be restored before panic; otherwise kdump will trigger
>> this error.
> Why can't firmware finish with the interrupt before injecting the SDEI event?
> If you need it to not happen a second time while the handler runs, you can always disable it.
>
> The text in the spec about the interaction of complete and physical interrupts is for
> bound interrupts. Linux doesn't support these. It isn't possible for linux to know whether
> firmware tied any other kind of event to a physical interrupt or not.
>
>
>> In the SDEI flow, SDEI_EVENT_COMPLETE_AND_RESUME should be called
>> before panic so that ehf_deactivate_priority() runs to completion.
> SDEI_EVENT_COMPLETE_AND_RESUME is a complete, it tells firmware to restore the execution
> state from before the event. You almost get away with x17-x30 being corrupted as
> panic() won't return - but the stack trace produced will be corrupt. If the original
> exception was from user-space, SP_EL0 will have been restored to be the user value. The
> kernel uses this for 'current'.
>
>
> The way this is supposed to work is the dying kernel calls SDEI_PE_MASK while it does
> the kdump reboot. Once the kdump kernel has started, the SDEI_PRIVATE_RESET and
> SDEI_SHARED_RESET calls should fix anything left over in firmware.
>
>
> Could you debug why firmware interrupts being active prevent the SMMU from being reset? As
> far as I can tell, those should be totally independent.
If ehf_deactivate_priority() is not executed, the pmr_el1 register is not
restored to >0x80, which leaves non-secure interrupts masked.
arm_smmu_device_probe() eventually calls usleep_range(), which relies on
an hrtimer. Because non-secure timer interrupts are masked, the task
sleeping in usleep_range() is never woken up.
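
The masking rule can be illustrated with a small user-space C model.
This is only a sketch, not ATF or kernel code: irq_is_signaled(),
demo_masking(), the timer priority 0xa0 and the two mask values are
hypothetical, chosen to show that the GIC signals an interrupt only
when its priority value is numerically lower than the priority mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * GIC priority rule: a lower numeric value means a higher priority, and
 * an interrupt is signaled to the PE only if its priority value is
 * strictly lower than the priority mask (PMR).
 */
static bool irq_is_signaled(uint8_t irq_priority, uint8_t pmr)
{
	return irq_priority < pmr;
}

/* Hypothetical values, for illustration only. */
static void demo_masking(void)
{
	const uint8_t ns_timer_prio = 0xa0; /* assumed non-secure timer priority */
	const uint8_t pmr_stuck     = 0x80; /* assumed mask left by the skipped deactivate */
	const uint8_t pmr_restored  = 0xf0; /* assumed mask after a normal deactivate */

	/* Mask not restored: the timer interrupt is never delivered. */
	assert(!irq_is_signaled(ns_timer_prio, pmr_stuck));

	/* Mask restored: the timer interrupt is delivered again. */
	assert(irq_is_signaled(ns_timer_prio, pmr_restored));
}
```

In this model, any interrupt with a priority value of 0x80 or above stays
masked while the mask is 0x80, which matches the hrtimer never firing.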
Thanks.
Liguang
>
>
> Thanks,
>
> James