Message-ID: <20251021154052.17132-1-fuqiang.wng@gmail.com>
Date: Tue, 21 Oct 2025 23:40:50 +0800
From: fuqiang wang <fuqiang.wng@...il.com>
To: Sean Christopherson <seanjc@...gle.com>,
	Paolo Bonzini <pbonzini@...hat.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	Borislav Petkov <bp@...en8.de>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	x86@...nel.org,
	"H . Peter Anvin" <hpa@...or.com>,
	Maxim Levitsky <mlevitsk@...hat.com>,
	kvm@...r.kernel.org,
	linux-kernel@...r.kernel.org
Cc: fuqiang wang <fuqiang.wng@...il.com>,
	wangfuqiang49 <wangfuqiang49@...com>
Subject: [PATCH 0/2] KVM: x86: fix some kvm period timer BUG

From: wangfuqiang49 <wangfuqiang49@...com>

The following two patches fix some bugs in the x86 KVM periodic timer.

=======
patch 1
=======

The first patch fixes a problem where, if the next period has already
expired, e.g. due to the period being smaller than the delay in processing
the timer, the hv timer will switch to the software timer and never switch
back. 

The reproduction steps and patch verification results are at link [1].

=======
patch 2
=======

The second patch fixes an issue where, if the first patch has not been
applied, resuming a virtual machine after it has been suspended for a long
time may trigger a hard lockup.

Link [2] also discusses this issue, but I don’t think it can actually
reproduce the problem. Because of commit [3], as long as the KVM timer is
running, target_expiration will keep catching up to now (unless every
single delay from timer virtualization is longer than the period, which is
a pretty extreme case).

Patch 2 is based on the changes in link [2], with one difference: in
link [2], target_expiration is updated to "now - period", and I’m not sure
why it doesn’t just catch up to now. Maybe I’m missing something? In
patch 2, I tentatively set target_expiration to catch up to now.
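
To make the difference between stepping by one period and catching up to
now concrete, here is a minimal user-space sketch. It is not the code in
arch/x86/kvm/lapic.c: the struct and function names are hypothetical, and
it assumes, as a simplification, that every expiration that is already in
the past costs one immediate timer callback.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a periodic timer; not the kernel's data structure. */
struct model_timer {
	int64_t target_expiration;	/* next expiration, in ns */
	int64_t period;			/* timer period, in ns */
};

/* Step the target forward by exactly one period. */
static void advance_one_period(struct model_timer *t)
{
	t->target_expiration += t->period;
}

/* Step by one period, then snap to now if the target is still in the past. */
static void advance_catch_up(struct model_timer *t, int64_t now)
{
	t->target_expiration += t->period;
	if (t->target_expiration < now)
		t->target_expiration = now;
}

/*
 * Count how many back-to-back expirations are processed before the
 * target moves past now. The timer is passed by value, so the caller's
 * copy is left unchanged.
 */
static int64_t count_expirations(struct model_timer t, int64_t now,
				 int catch_up)
{
	int64_t n = 0;

	assert(t.period > 0);	/* a zero period would never terminate */
	while (t.target_expiration <= now) {
		if (catch_up)
			advance_catch_up(&t, now);
		else
			advance_one_period(&t);
		n++;
	}
	return n;
}
```

In this model, with a 1000 ns period and a target that is 10 ms behind
now, count_expirations() returns 10001 when stepping one period at a time
and 2 when catching up to now. The count in the first case grows linearly
with how long the VM was suspended.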

The reproduction steps and patch verification results are at link [4].

=============================================================
other questions -- Should the two patches be merged into one?
=============================================================

If we end up making target_expiration catch up to now, patch 1 and patch 2
could probably be combined, since we wouldn’t need the delta > 0 check
anymore. But keeping them separate helps clearly show the two different
problems we’re fixing. 

However, the hard lockup only occurs when patch 1 is not merged, so the
commit message for the second patch repeatedly mentions
"...previous patch is merged". I’m not sure whether this is appropriate,
so I would like to hear your suggestions.

[1]: https://github.com/cai-fuqiang/kernel_test/tree/master/period_timer_test
[2]: https://lore.kernel.org/kvm/YgahsSubOgFtyorl@fuller.cnet/
[3]: d8f2f498d9ed ("x86/kvm: fix LAPIC timer drift when guest uses periodic mode")
[4]: https://github.com/cai-fuqiang/md/tree/master/case/intel_kvm_period_timer

fuqiang wang (2):
  avoid hv timer fallback to sw timer if delay exceeds period
  fix hardlockup when waking VM after long suspend

 arch/x86/kvm/lapic.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

-- 
2.47.0

