Message-ID: <20251105135340.33335-1-fuqiang.wng@gmail.com>
Date: Wed, 5 Nov 2025 21:53:37 +0800
From: fuqiang wang <fuqiang.wng@...il.com>
To: Sean Christopherson <seanjc@...gle.com>,
Paolo Bonzini <pbonzini@...hat.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>,
x86@...nel.org,
Marcelo Tosatti <mtosatti@...hat.com>,
"H . Peter Anvin" <hpa@...or.com>,
Maxim Levitsky <mlevitsk@...hat.com>,
kvm@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: fuqiang wang <fuqiang.wng@...il.com>,
yu chen <yuchen33988979@....com>,
dongxu zhang <dongxuzhangxu910121@...a.com>
Subject: [PATCH v4 0/1] KVM: x86: fix some kvm period timer BUG
This patch fixes two issues with the periodic timer:
=======
issue 1
=======
If the next period has already expired, e.g. because the period is shorter
than the delay in processing the timer, the hv timer switches to the
software timer and never switches back.
=======
issue 2
=======
Resuming a virtual machine after it has been suspended for a long time may
trigger a hard lockup.
Marcelo also discusses this issue in link [2], but I don't think the scenario
described there can actually reproduce the problem. Because of commit [3], as
long as the KVM timer is running, target_expiration keeps catching up to now
(unless every single delay from timer virtualization is longer than the
period, which is a pretty extreme case). This patch is based on the patch in
link [2], but differs in how target_expiration is updated: in link [2],
target_expiration is updated to "now - period" (I'm not sure why it doesn't
just catch up to now; maybe I'm missing something?). In this patch,
target_expiration catches up to now.
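To make the catch-up rule concrete, here is a minimal user-space sketch of the
idea. It is not the code in arch/x86/kvm/lapic.c: the struct, its fields, and
the function name advance_expiration() are hypothetical and exist only for
illustration. The sketch assumes the timer is advanced by one period per
expiry and is clamped to now if the result is still in the past.

```c
#include <stdint.h>

/* Hypothetical model of a periodic timer; not the KVM data structure. */
struct periodic_timer {
	int64_t target_expiration_ns;	/* next expiry, in nanoseconds */
	int64_t period_ns;		/* timer period, in nanoseconds */
};

/*
 * Advance the expiry by one period. If the new expiry is still earlier
 * than now_ns (e.g. after a long suspend), clamp it to now_ns instead of
 * stepping forward one period at a time.
 * Returns the remaining delay until the next expiry, which is never negative.
 */
static int64_t advance_expiration(struct periodic_timer *t, int64_t now_ns)
{
	t->target_expiration_ns += t->period_ns;
	if (t->target_expiration_ns < now_ns)
		t->target_expiration_ns = now_ns;

	return t->target_expiration_ns - now_ns;
}
```

In this model, a timer with a 1000 ns period that is 1 s behind lands on now
after a single call, where stepping one period at a time would take on the
order of a million steps.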
=========================================================================
other questions -- Should the two issues be fixed together or separately?
=========================================================================
In v3, I split the fix into two patches. In this version, however, if the
advanced target_expiration is still earlier than the current time,
target_expiration is updated to now. (If target_expiration were updated to
"now - period" instead, splitting it into two patches would look better.)
Updating to now would revert code added in patch 1, so in this version the
two patches are merged into one.
On the other hand, keeping them separate would clearly show the two different
problems being fixed, so I still don't know which approach is best. Please
give me some advice.
Changes in v4:
- Merge the two patches into one
Changes in v3:
- Fix: advanced SW timer (hrtimer) expiration does not catch up to current
time.
- Improve the commit message of patch 2
- link to v2: https://lore.kernel.org/all/20251021154052.17132-1-fuqiang.wng@gmail.com/
Changes in v2:
- Add a bugfix for the hard lockup
- link to v1: https://lore.kernel.org/all/20251013125117.87739-1-fuqiang.wng@gmail.com/
[1]: https://github.com/cai-fuqiang/kernel_test/tree/master/period_timer_test
[2]: https://lore.kernel.org/kvm/YgahsSubOgFtyorl@fuller.cnet/
[3]: d8f2f498d9ed ("x86/kvm: fix LAPIC timer drift when guest uses periodic mode")
[4]: https://github.com/cai-fuqiang/md/tree/master/case/intel_kvm_period_timer
fuqiang wang (1):
  fix hardlockup when waking VM after long suspend

 arch/x86/kvm/lapic.c | 32 ++++++++++++++++++++++++--------
 1 file changed, 24 insertions(+), 8 deletions(-)
--
2.47.0