Message-ID: <MWHPR21MB15930BECFDF0251D36CFBB98D7A69@MWHPR21MB1593.namprd21.prod.outlook.com>
Date: Sat, 16 Jan 2021 22:48:04 +0000
From: Michael Kelley <mikelley@...rosoft.com>
To: Dexuan Cui <decui@...rosoft.com>,
"wei.liu@...nel.org" <wei.liu@...nel.org>,
Haiyang Zhang <haiyangz@...rosoft.com>,
"hpa@...or.com" <hpa@...or.com>, KY Srinivasan <kys@...rosoft.com>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Stephen Hemminger <sthemmin@...rosoft.com>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"x86@...nel.org" <x86@...nel.org>, vkuznets <vkuznets@...hat.com>,
"bp@...en8.de" <bp@...en8.de>
CC: "ohering@...e.com" <ohering@...e.com>,
"jwiesner@...e.com" <jwiesner@...e.com>,
"marcelo.cerri@...onical.com" <marcelo.cerri@...onical.com>
Subject: RE: [PATCH] x86/hyperv: Initialize clockevents after LAPIC is initialized
From: Dexuan Cui <decui@...rosoft.com> Sent: Saturday, January 16, 2021 2:32 PM
>
> With commit 4df4cb9e99f8, the Hyper-V direct-mode STIMER is actually
> initialized before LAPIC is initialized: see
>
>   apic_intr_mode_init()
>
>     x86_platform.apic_post_init()
>       hyperv_init()
>         hv_stimer_alloc()
>
>     apic_bsp_setup()
>       setup_local_APIC()
>
> setup_local_APIC() temporarily disables the LAPIC, initializes it, and
> re-enables it. The direct-mode STIMER depends on LAPIC, and when it's
> registered, it can be programmed immediately and the timer can fire
> very soon:
>
>   hv_stimer_init
>     clockevents_config_and_register
>       clockevents_register_device
>         tick_check_new_device
>           tick_setup_device
>             tick_setup_periodic(), tick_setup_oneshot()
>               clockevents_program_event
>
> When the timer fires in the hypervisor, if the LAPIC is in the
> disabled state, new versions of Hyper-V ignore the event and don't inject
> the timer interrupt into the VM, and hence the VM hangs when it boots.
>
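The failure mode described above can be sketched as a toy model in
plain C. This is only an illustration, not kernel or Hyper-V code:
every toy_* name is hypothetical, and the model assumes the worst case
in which a pending one-shot timer expires exactly while the LAPIC is
disabled and is consumed whether or not the interrupt is delivered.

```c
#include <stdbool.h>

/* Toy model only: none of these names are kernel or Hyper-V APIs. */
struct toy_vm {
    bool lapic_enabled;   /* software-enable state of the local APIC */
    bool stimer_armed;    /* a direct-mode timer has been programmed */
    int  ticks_delivered; /* timer interrupts that reached the guest */
};

/* Hypervisor-side expiry: the event is dropped if the LAPIC is disabled. */
static void toy_timer_expires(struct toy_vm *vm)
{
    if (!vm->stimer_armed)
        return;
    vm->stimer_armed = false;      /* assumption: one-shot, consumed either way */
    if (vm->lapic_enabled)
        vm->ticks_delivered++;
}

/* Stand-in for setup_local_APIC(): disable, initialize, re-enable. */
static void toy_setup_local_apic(struct toy_vm *vm)
{
    vm->lapic_enabled = false;
    toy_timer_expires(vm);         /* worst case: expiry lands in the window */
    vm->lapic_enabled = true;
}

/* Returns how many timer interrupts the guest sees after "boot". */
static int toy_ticks_after_boot(bool register_before_lapic_setup)
{
    struct toy_vm vm = { .lapic_enabled = true };

    if (register_before_lapic_setup) {
        vm.stimer_armed = true;    /* timer registered and programmed early */
        toy_setup_local_apic(&vm); /* expiry is dropped inside the window */
    } else {
        toy_setup_local_apic(&vm); /* nothing armed yet, nothing to lose */
        vm.stimer_armed = true;
        toy_timer_expires(&vm);    /* LAPIC is enabled, tick is delivered */
    }
    return vm.ticks_delivered;
}
```

In this model, toy_ticks_after_boot(true) delivers no tick, which
corresponds to the boot hang, while toy_ticks_after_boot(false)
delivers one.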
> Note: when the VM starts/reboots, the LAPIC is pre-enabled by the
> firmware, so the window of LAPIC being temporarily disabled is pretty
> small, and the issue can only happen once out of 100~200 reboots for
> a 40-vCPU VM on one dev host, and on another host the issue doesn't
> reproduce after 2000 reboots.
>
> The issue is more noticeable for kdump/kexec, because the LAPIC is
> disabled by the first kernel, and stays disabled until the kdump/kexec
> kernel enables it. This is especially an issue for a Generation-2 VM
> (for which Hyper-V doesn't emulate the PIT timer) when CONFIG_HZ=1000
> (rather than CONFIG_HZ=250) is used.
>
> Fix the issue by moving hv_stimer_alloc() to a later place where the
> LAPIC timer is initialized.
>
> Fixes: 4df4cb9e99f8 ("x86/hyperv: Initialize clockevents earlier in CPU onlining")
> Signed-off-by: Dexuan Cui <decui@...rosoft.com>
> ---
> arch/x86/hyperv/hv_init.c | 29 ++++++++++++++++++++++++++---
> 1 file changed, 26 insertions(+), 3 deletions(-)
>
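One way to picture "moving hv_stimer_alloc() to a later place" is to
chain onto a later setup hook instead of calling the allocation
directly from early platform init. The diff is not quoted in this
message, so the mechanism below is an assumption, and every toy_* name
is hypothetical; only the save-and-wrap pattern is the point.

```c
#include <stdbool.h>

/* Toy state; stand-ins for "LAPIC timer is set up" and "STIMER allocated". */
static bool toy_lapic_timer_ready;
static bool toy_stimer_allocated;
static bool toy_stimer_allocated_too_early;

/* Default late hook: sets up the LAPIC timer. */
static void toy_default_setup_clockev(void)
{
    toy_lapic_timer_ready = true;
}

/* Hook invoked late in boot, plus a slot for the saved original. */
static void (*toy_setup_clockev_hook)(void) = toy_default_setup_clockev;
static void (*toy_saved_setup_clockev)(void);

/* Stand-in for the timer allocation; records whether it ran too early. */
static void toy_stimer_alloc(void)
{
    if (!toy_lapic_timer_ready)
        toy_stimer_allocated_too_early = true;
    toy_stimer_allocated = true;
}

/* Wrapper: run the original setup first, then allocate the timer. */
static void toy_hv_setup_clockev(void)
{
    toy_saved_setup_clockev();
    toy_stimer_alloc();
}

/* Early platform init: only installs the wrapper, allocates nothing. */
static void toy_platform_init(void)
{
    toy_saved_setup_clockev = toy_setup_clockev_hook;
    toy_setup_clockev_hook = toy_hv_setup_clockev;
}

/* Boot order: early init first, the late hook afterwards. */
static void toy_boot(void)
{
    toy_platform_init();
    toy_setup_clockev_hook();
}
```

After toy_boot(), the timer is allocated and the "too early" flag stays
clear, because the allocation only runs once the original late setup
has completed.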
Reviewed-by: Michael Kelley <mikelley@...rosoft.com>