Message-ID: <b5a1370c-1319-24d1-6b2a-629e5c8915ed@igalia.com>
Date: Mon, 9 May 2022 09:32:27 -0300
From: "Guilherme G. Piccoli" <gpiccoli@...lia.com>
To: Paolo Bonzini <pbonzini@...hat.com>,
Sean Christopherson <seanjc@...gle.com>, vkuznets@...hat.com
Cc: kexec@...ts.infradead.org, pmladek@...e.com, bhe@...hat.com,
akpm@...ux-foundation.org, linux-kernel@...r.kernel.org,
bcm-kernel-feedback-list@...adcom.com, coresight@...ts.linaro.org,
linuxppc-dev@...ts.ozlabs.org, linux-alpha@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-edac@...r.kernel.org,
linux-hyperv@...r.kernel.org, linux-leds@...r.kernel.org,
linux-mips@...r.kernel.org, linux-parisc@...r.kernel.org,
linux-pm@...r.kernel.org, linux-remoteproc@...r.kernel.org,
linux-s390@...r.kernel.org, linux-tegra@...r.kernel.org,
linux-um@...ts.infradead.org, linux-xtensa@...ux-xtensa.org,
netdev@...r.kernel.org, openipmi-developer@...ts.sourceforge.net,
rcu@...r.kernel.org, sparclinux@...r.kernel.org,
xen-devel@...ts.xenproject.org, x86@...nel.org,
kernel-dev@...lia.com, kernel@...ccoli.net, halves@...onical.com,
fabiomirmar@...il.com, alejandro.j.jimenez@...cle.com,
andriy.shevchenko@...ux.intel.com, arnd@...db.de, bp@...en8.de,
corbet@....net, d.hatayama@...fujitsu.com,
dave.hansen@...ux.intel.com, dyoung@...hat.com,
feng.tang@...el.com, gregkh@...uxfoundation.org,
mikelley@...rosoft.com, hidehiro.kawai.ez@...achi.com,
jgross@...e.com, john.ogness@...utronix.de, keescook@...omium.org,
luto@...nel.org, mhiramat@...nel.org, mingo@...hat.com,
paulmck@...nel.org, peterz@...radead.org, rostedt@...dmis.org,
senozhatsky@...omium.org, stern@...land.harvard.edu,
tglx@...utronix.de, vgoyal@...hat.com, will@...nel.org,
"David P . Reed" <dpreed@...pplum.com>
Subject: Re: [PATCH 01/30] x86/crash,reboot: Avoid re-disabling VMX in all
CPUs on crash/restart
On 27/04/2022 19:48, Guilherme G. Piccoli wrote:
> In the panic path we have a list of functions to be called, the panic
> notifiers. Such callbacks perform various actions in the machine's
> last breath, and sometimes users want them to run before kdump; the
> parameter "crash_kexec_post_notifiers" exists for that. When this
> parameter is used, the function "crash_smp_send_stop()" is executed
> to power off all secondary CPUs through the NMI-shootdown mechanism;
> part of this process involves disabling virtualization features in
> all CPUs (except the main one).
>
> Now, in the emergency restart procedure we also have a way of
> disabling VMX in all CPUs, using the same NMI-shootdown mechanism.
> However, if we have already NMI-disabled all CPUs, the emergency
> restart fails due to a second addition of the same items to the
> NMI list, as shown in the following log output:
>
> sysrq: Trigger a crash
> Kernel panic - not syncing: sysrq triggered crash
> [...]
> Rebooting in 2 seconds..
> list_add double add: new=<addr1>, prev=<addr2>, next=<addr1>.
> ------------[ cut here ]------------
> kernel BUG at lib/list_debug.c:29!
> invalid opcode: 0000 [#1] PREEMPT SMP PTI
>
> To reproduce the problem, users just need to set the kernel
> parameter "crash_kexec_post_notifiers" *without* kdump set, on any
> system with the VMX feature present.
>
> Since there is no benefit in re-disabling VMX in all CPUs if that
> was already done, this patch guards the restart routine against
> issuing the NMIs a second time. Notice we still need to disable
> VMX locally in the emergency restart.
>
> Fixes: ed72736183c4 ("x86/reboot: Force all cpus to exit VMX root if VMX is supported")
> Fixes: 0ee59413c967 ("x86/panic: replace smp_send_stop() with kdump friendly version in panic path")
> Cc: David P. Reed <dpreed@...pplum.com>
> Cc: Hidehiro Kawai <hidehiro.kawai.ez@...achi.com>
> Cc: Paolo Bonzini <pbonzini@...hat.com>
> Cc: Sean Christopherson <seanjc@...gle.com>
> Signed-off-by: Guilherme G. Piccoli <gpiccoli@...lia.com>
> ---
> arch/x86/include/asm/cpu.h | 1 +
> arch/x86/kernel/crash.c | 8 ++++----
> arch/x86/kernel/reboot.c | 14 ++++++++++++--
> 3 files changed, 17 insertions(+), 6 deletions(-)
>
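For anyone skimming, the guard idea can be sketched as a small user-space
model. This is only an illustration, not the code from the patch: every
name below (nmi_shootdown_once(), panic_path_model(),
emergency_restart_model() and the three flags/counters) is hypothetical,
and the assert() merely stands in for the list_add() sanity check seen in
the log above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names come from the patch. */
static bool shootdown_done;          /* true once secondary CPUs were stopped via NMI */
static int nmi_callbacks_registered; /* how many times the NMI callback was added */
static bool vmx_disabled_locally;    /* VMX state of the CPU running the restart */

/* Models the NMI shootdown: registers the callback at most once. */
static bool nmi_shootdown_once(void)
{
	if (shootdown_done)
		return false;	/* already done, skip the second registration */

	shootdown_done = true;
	nmi_callbacks_registered++;
	/* Stands in for the list_add() sanity check that fired in the log. */
	assert(nmi_callbacks_registered == 1);
	return true;
}

/* Models the panic path with crash_kexec_post_notifiers set. */
static void panic_path_model(void)
{
	nmi_shootdown_once();
}

/* Models the emergency restart: local disable always, shootdown only if needed. */
static void emergency_restart_model(void)
{
	vmx_disabled_locally = true;
	nmi_shootdown_once();
}
```

Calling panic_path_model() followed by emergency_restart_model() leaves
nmi_callbacks_registered at 1, while the local VMX disable still happens.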
Hi Paolo / Sean / Vitaly, sorry for the ping, but do you think this fix
is OK from the VMX point of view?
I'd like to send a V2 of this series soon, so any review here is highly
appreciated!
Cheers,
Guilherme