Open Source and information security mailing list archives
 
Message-ID: <aPJnxDj4mFSJc0tV@google.com>
Date: Fri, 17 Oct 2025 08:59:00 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: fuqiang wang <fuqiang.wng@...il.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Thomas Gleixner <tglx@...utronix.de>, 
	Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, 
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org, 
	"H . Peter Anvin" <hpa@...or.com>, Maxim Levitsky <mlevitsk@...hat.com>, kvm@...r.kernel.org, 
	linux-kernel@...r.kernel.org, yu chen <chen.yu@...ystack.com>, 
	dongxu zhang <dongxu.zhang@...ystack.com>
Subject: Re: [PATCH RESEND] avoid hv timer fallback to sw timer if delay
 exceeds period

On Fri, Oct 17, 2025, fuqiang wang wrote:
> On 10/14/25 7:29 AM, Sean Christopherson wrote:
> > On Wed, Oct 01, 2025, fuqiang wang wrote:
> > The only code that cares is __kvm_wait_lapic_expire(), and the only downside to
> > setting tscdeadline=L1.TSC is that adjust_lapic_timer_advance() won't adjust as
> > aggressively as it probably should.
> 
> I am not sure which type of timers should use the "advanced tscdeadline hrtimer
> expiration feature".
> 
> I list the history of this feature.
> 
> 1. Marcelo first introduced this feature, supporting only the tscdeadline sw timer.
> 2. Yunhong introduced the vmx preemption timer (hv), supporting only tscdeadline.
> 3. Liwanpeng extended the hv timer to oneshot and periodic timers.
> 4. Liwanpeng extended this feature to the hv timer.
> 5. Sean and Liwanpeng fixed some bugs and extended this feature to hv periodic/oneshot timers.
> 
> [1] d0659d946be0("KVM: x86: add option to advance tscdeadline hrtimer expiration")
>     Marcelo Tosatti     Dec 16 2014
> [2] ce7a058a2117("KVM: x86: support using the vmx preemption timer for tsc deadline timer")
>     Yunhong Jiang       Jun 13 2016
> [3] 8003c9ae204e("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support")
>     liwanpeng           Oct 24 2016
> [4] c5ce8235cffa("KVM: VMX: Optimize tscdeadline timer latency")
>     liwanpeng           May 29 2018
> [5] ee66e453db13("KVM: lapic: Busy wait for timer to expire when using hv_timer")
>     Sean Christopherson Apr 16 2019
> 
>     d981dd15498b("KVM: LAPIC: Accurately guarantee busy wait for timer to expire when using hv_timer")
>     liwanpeng           Apr 28 2021
> 
> Now, the timers supported by this feature include:
> - sw: tscdeadline
> - hv: all (tscdeadline, oneshot, period)
> 
> ====
> IMHO
> ====
> 
> 1. for period timer
> ===================
> 
> I think that for periodic timer emulation, the expiration time is already
> adjusted to compensate for the delays introduced by timer emulation, so this
> feature is not needed to adjust it again. But with the feature in use, the
> first timer expiration may be relatively accurate.
> 
> E.g., At time 0, start a periodic task (period: 10,000 ns) with a simulated
> delay of 100 ns.
> 
> With this feature enabled and reasonably accurate prediction, the expiration
> times seen by the guest are: 10000, 20000, 30000...
> 
> With this feature not enabled, the expiration times are: 10100, 20100, 30100...
> 
> But IMHO, for periodic timers, accuracy of the period seems to be the main
> concern, because it does not frequently start and stop. The incorrect period
> caused by the first timer expiration can be ignored.

I agree it's superfluous, but applying the advancement also does no harm, and
avoiding it would be more effort than simply letting KVM predict the first expiration.
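
To make the arithmetic in the quoted example concrete, here is a minimal sketch.
It is a toy model of the numbers above (period 10,000 ns, delay 100 ns), not
KVM's actual logic; observed_expirations() and spacing() are hypothetical helper
names, and the assumption that only the first expiration absorbs the delay is
taken from the example itself.

```python
def observed_expirations(period_ns, delay_ns, count, advance_first):
    """Toy model of when a guest sees each periodic expiration (in ns).

    Assumption (from the quoted example, not from KVM code): only the first
    expiration is shifted by the emulation delay; every later expiration
    lands exactly one period after the previous one.
    """
    first = period_ns if advance_first else period_ns + delay_ns
    return [first + i * period_ns for i in range(count)]


def spacing(times):
    """Return the gaps between consecutive expirations."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


with_advance = observed_expirations(10_000, 100, 3, advance_first=True)
without_advance = observed_expirations(10_000, 100, 3, advance_first=False)

print(with_advance)      # [10000, 20000, 30000]
print(without_advance)   # [10100, 20100, 30100]
print(spacing(with_advance) == spacing(without_advance))  # True
```

In this model the advancement only moves the phase of the first tick; the
period the guest observes is identical either way, which is why the benefit
for periodic timers is small.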

> 2. for oneshot timer
> ====================
> 
> In [1], Marcelo treated oneshot and tscdeadline differently. Shouldn’t the
> behavior of these two timers be similar?

Yes, but they aren't identical, and so supporting both would require additional
code, complexity, and testing.

> Unlike periodic timers, both oneshot and tscdeadline timers set a specific
> expiration time, and what the guest cares about is whether the expiration
> time is accurate. Moreover, this feature is mainly intended to mitigate the
> latency introduced by timer virtualization.  Since software timers have
> higher latency compared to hardware virtual timers, the need for this feature
> is actually more urgent for software timers.

Yep.

> However, in the current implementation, the feature is applied to hv
> oneshot/period timers, but not to sw oneshot/period timers.
>
> ===============
> Summary of IMHO
> ===============
> 
> The feature should be applied to the following timer types:
> sw/hv tscdeadline and sw/hv oneshot

In a perfect world, probably?  But I don't know that it's worth changing at this
time.  Much of this is balancing complexity with benefit, though it's also most
definitely a reflection of the initial implementation.

KVM unconditionally emulates TSC-deadline mode, and AFAIK every real-world kernel
prefers TSC-deadline over one-shot, and so in practice the benefits of applying
the advancement to one-shot hrtimers are likely small.  That was also the way the world was headed
back when Marcelo first implemented the support.  I don't know for sure why the
initial implementation targeted only TSC-deadline mode, but I think it's safe to
assume that the use case Marcelo was targeting exclusively used TSC-deadline.

I'm not entirely opposed to playing the advancement games with one-shot hrtimers,
but it's also not clear to me that it's worth doing.  E.g. supporting one-shot
hrtimers would likely require a bit of extra complexity to juggle the different
time domains.  And if the only use cases that are truly sensitive to timer
programming latency exclusively use TSC-deadline mode (because one-shot mode is
inherently "fuzzy"), then any amount of extra complexity is effectively dead weight.

> should not be applied to:
> sw/hv period

I wouldn't say "should not be applied to", I think it's more "doesn't provide much
benefit to".
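
For reference, the coverage discussed in this thread can be written down as a
small sketch. It only encodes the two lists from the quoted mail (current: sw
tscdeadline plus all hv modes; proposed: sw/hv tscdeadline and oneshot). The
names Backend, Mode, applied_today() and applied_proposed() are hypothetical
and do not correspond to anything in the KVM source.

```python
from enum import Enum
from itertools import product


class Backend(Enum):
    SW = "sw"  # hrtimer-based emulation
    HV = "hv"  # VMX preemption timer


class Mode(Enum):
    TSCDEADLINE = "tscdeadline"
    ONESHOT = "oneshot"
    PERIODIC = "period"


def applied_today(backend, mode):
    """Current coverage per the thread: sw tscdeadline, and every hv mode."""
    return backend is Backend.HV or mode is Mode.TSCDEADLINE


def applied_proposed(backend, mode):
    """Proposed coverage per the thread: everything except periodic."""
    return mode is not Mode.PERIODIC


# Combinations whose treatment would change under the proposal.
changed = [
    (backend.value, mode.value)
    for backend, mode in product(Backend, Mode)
    if applied_today(backend, mode) != applied_proposed(backend, mode)
]
print(changed)  # [('sw', 'oneshot'), ('hv', 'period')]
```

Only two combinations differ: sw oneshot would gain the advancement, and hv
periodic would lose it, which is the case where it "doesn't provide much
benefit" rather than does harm.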
