Message-ID: <81f5d521-bc8a-4d1a-fe7e-55487f3d25b3@huawei.com>
Date: Thu, 23 Feb 2023 10:29:33 +0800
From: Zeng Heng <zengheng4@...wei.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>,
Peter Zijlstra <peterz@...radead.org>
CC: <alexander.shishkin@...ux.intel.com>, <tglx@...utronix.de>,
<tiwai@...e.de>, <jolsa@...nel.org>, <vbabka@...e.cz>,
<keescook@...omium.org>, <mingo@...hat.com>, <acme@...nel.org>,
<namhyung@...nel.org>, <bp@...en8.de>, <bhe@...hat.com>,
<eric.devolder@...cle.com>, <hpa@...or.com>, <jroedel@...e.de>,
<dave.hansen@...ux.intel.com>, <linux-perf-users@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <liwei391@...wei.com>,
<x86@...nel.org>, <xiexiuqi@...wei.com>, <liaochang1@...wei.com>
Subject: Re: [RFC PATCH v4] x86/kdump: terminate watchdog NMI interrupt to
avoid kdump crashes
On 2023/2/23 2:39, Eric W. Biederman wrote:
> Peter Zijlstra <peterz@...radead.org> writes:
>
>> On Fri, Feb 17, 2023 at 08:06:04PM +0800, Zeng Heng wrote:
>>> If the cpu panics within the NMI interrupt context, there could be
>>> unhandled NMI interrupts in the background which are blocked by processor
>>> until next IRET instruction executes. Since that, it prevents nested
>>> NMI handler execution.
>>>
>>> In case of IRET execution during kdump reboot and no proper NMIs handler
>>> registered at that point (such as during EFI loader)
> EFI loader? kexec on panic is supposed to be kernel to kernel.
> If someone is getting EFI involved that is a bug.
In the kdump path, kexec starts purgatory, which verifies the second kernel
with sha256. If verification passes, control is handed to the EFI loader,
which calls the second kernel to capture the environment as a vmcore file.
As the mail said, if the panic occurs within NMI context, we never leave
that context until the EFI loader handles a page fault exception and
executes an IRET instruction on exit from the PF handler. At that moment,
the processor allows the blocked NMI to be raised.
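To make the sequence concrete, here is a toy Python model of it. This is
only a sketch, not kernel code: the class ToyCpu, its methods, and the
log strings are all made up for illustration, and the only assumption is
the behaviour described above (NMIs stay blocked after delivery until the
next IRET, and one NMI arriving in the meantime is held pending).

```python
class ToyCpu:
    """Hypothetical model of NMI blocking; names are illustrative only."""

    def __init__(self):
        self.nmi_blocked = False       # set on NMI delivery, cleared by IRET
        self.nmi_pending = False       # one NMI held back while blocked
        self.handler_installed = True  # False models "no proper NMI handler"
        self.log = []

    def raise_nmi(self):
        # While blocked, a new NMI is only latched, not delivered.
        if self.nmi_blocked:
            self.nmi_pending = True
            self.log.append("latched")
        else:
            self._deliver()

    def _deliver(self):
        self.nmi_blocked = True
        if self.handler_installed:
            self.log.append("delivered")
        else:
            self.log.append("delivered-without-handler")

    def iret(self):
        # Any IRET unblocks NMIs, including the one that ends a PF handler.
        self.nmi_blocked = False
        self.log.append("iret")
        if self.nmi_pending:
            self.nmi_pending = False
            self._deliver()


def kdump_scenario(watchdog_fires=True):
    cpu = ToyCpu()
    cpu.raise_nmi()                # panic happens inside this NMI, no IRET
    if watchdog_fires:
        cpu.raise_nmi()            # watchdog NMI arrives while blocked
    cpu.handler_installed = False  # control has left the crashed kernel
    cpu.iret()                     # loader returns from its PF handler
    return cpu.log


if __name__ == "__main__":
    print(kdump_scenario())
    print(kdump_scenario(watchdog_fires=False))
```

With watchdog_fires=False the model never latches an NMI, so the IRET
after the page fault has nothing to release.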
>> This kills all of perf, including but not limited to the hardware
>> watchdog. However, it does nothing to external NMI sources like the NMI
>> button found on some HP machines.
>>
>> Still I suppose it is sufficient for the normal case.
> I can't think of one why we don't just leave
> NMIs deliberately disabled
How would we just leave NMIs disabled? Could you explain in more detail?
Zeng Heng
> until the crash recover kernel figured out how to enable them safely.
>