Message-ID: <aPJnxDj4mFSJc0tV@google.com>
Date: Fri, 17 Oct 2025 08:59:00 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: fuqiang wang <fuqiang.wng@...il.com>
Cc: Paolo Bonzini <pbonzini@...hat.com>, Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H . Peter Anvin" <hpa@...or.com>, Maxim Levitsky <mlevitsk@...hat.com>, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, yu chen <chen.yu@...ystack.com>,
dongxu zhang <dongxu.zhang@...ystack.com>
Subject: Re: [PATCH RESEND] avoid hv timer fallback to sw timer if delay
exceeds period

On Fri, Oct 17, 2025, fuqiang wang wrote:
> On 10/14/25 7:29 AM, Sean Christopherson wrote:
> > On Wed, Oct 01, 2025, fuqiang wang wrote:
> > The only code that cares is __kvm_wait_lapic_expire(), and the only downside to
> > setting tscdeadline=L1.TSC is that adjust_lapic_timer_advance() won't adjust as
> > aggressively as it probably should.
>
> I am not sure which type of timers should use the "advanced tscdeadline hrtimer
> expiration feature".
>
> I list the history of this feature.
>
> 1. Marcelo first introduce this feature, only support the tscdeadline sw timer.
> 2. Yunhong introduce vmx preemption timer(hv), only support for tscdeadline.
> 3. Liwanpeng extend the hv timer to oneshot and period timers.
> 4. Liwanpeng extend this feature to hv timer.
> 5. Sean and liwanpeng fix some BUG extend this feature to hv period/oneshot timer.
>
> [1] d0659d946be0("KVM: x86: add option to advance tscdeadline hrtimer expiration")
> Marcelo Tosatti Dec 16 2014
> [2] ce7a058a2117("KVM: x86: support using the vmx preemption timer for tsc deadline timer")
> Yunhong Jiang Jun 13 2016
> [3] 8003c9ae204e("KVM: LAPIC: add APIC Timer periodic/oneshot mode VMX preemption timer support")
> liwanpeng Oct 24 2016
> [4] c5ce8235cffa("KVM: VMX: Optimize tscdeadline timer latency")
> liwanpeng May 29 2018
> [5] ee66e453db13("KVM: lapic: Busy wait for timer to expire when using hv_timer")
> Sean Christopherson Apr 16 2019
>
> d981dd15498b("KVM: LAPIC: Accurately guarantee busy wait for timer to expire when using hv_timer")
> liwanpeng Apr 28 2021
>
> Now, timers supported for this feature includes:
> - sw: tscdeadline
> - hv: all (tscdeadline, oneshot, period)
>
> ====
> IMHO
> ====
>
> 1. for period timer
> ===================
>
> I think for periodic timers emulation, the expiration time is already adjusted
> to compensate for the delays introduced by timer emulation, so don't need this
> feature to adjust again. But after use the feature, the first timer expiration
> may be relatively accurate.
>
> E.g., At time 0, start a periodic task (period: 10,000 ns) with a simulated
> delay of 100 ns.
>
> With this feature enabled and reasonably accurate prediction, the expiration
> time set seen by the guest are: 10000, 20000, 30000...
>
> With this feature not enabled, the expiration times set: 10100, 20100, 30100...
>
> But IMHO, for periodic timers, accuracy of the period seems to be the main
> concern, because it does not frequently start and stop. The incorrect period
> caused by the first timer expiration can be ignored.

I agree it's superfluous, but applying the advancement also does no harm, and
avoiding it would be more effort than simply letting KVM predict the first
expiration.

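
For reference, the arithmetic in the quoted example can be sketched as below.
This is a toy model, not KVM code: observed_expiry_ns() and
observed_period_ns() are hypothetical names, and the model assumes a constant
delivery delay and a single advancement value applied to every tick.

```c
#include <stdint.h>

/*
 * Toy model (hypothetical, not KVM code): time at which the guest observes
 * tick n of a periodic timer started at time 0.  The timer is armed
 * advance_ns early, and delivery adds a constant delay_ns.
 */
static uint64_t observed_expiry_ns(uint64_t n, uint64_t period_ns,
				   uint64_t delay_ns, uint64_t advance_ns)
{
	return n * period_ns - advance_ns + delay_ns;
}

/* Interval the guest sees between tick n and tick n + 1. */
static uint64_t observed_period_ns(uint64_t n, uint64_t period_ns,
				   uint64_t delay_ns, uint64_t advance_ns)
{
	return observed_expiry_ns(n + 1, period_ns, delay_ns, advance_ns) -
	       observed_expiry_ns(n, period_ns, delay_ns, advance_ns);
}
```

With period_ns = 10000 and delay_ns = 100, advance_ns = 100 gives expirations
at 10000, 20000, 30000, while advance_ns = 0 gives 10100, 20100, 30100.  In
both cases the observed period is 10000, i.e. under these assumptions the
advancement only shifts the phase of the first expiration.
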
> 2. for oneshot timer
> ====================
>
> In [1], Marcelo treated oneshot and tscdeadline differently. Shouldn’t the
> behavior of these two timers be similar?

Yes, but they aren't identical, and so supporting both would require additional
code, complexity, and testing.

> Unlike periodic timers, both oneshot and tscdeadline timers set a specific
> expiration time, and what the guest cares about is whether the expiration
> time is accurate. Moreover, this feature is mainly intended to mitigate the
> latency introduced by timer virtualization. Since software timers have
> higher latency compared to hardware virtual timers, the need for this feature
> is actually more urgent for software timers.

Yep.

> However, in the current implementation, the feature is applied to hv
> oneshot/period timers, but not to sw oneshot/period timers.
>
> ===============
> Summary of IMHO
> ===============
>
> The feature should be applied to the following timer types:
> sw/hv tscdeadline and sw/hv oneshot

In a perfect world, probably? But I don't know that it's worth changing at this
time. Much of this is balancing complexity with benefit, though it's also most
definitely a reflection of the initial implementation.

KVM unconditionally emulates TSC-deadline mode, and AFAIK every real-world kernel
prefers TSC-deadline over one-shot, and so in practice the benefits of applying
the advancement to one-shot hrtimers would be minimal. That was also the way the
world was headed back when Marcelo first implemented the support. I don't know
for sure why the initial implementation targeted only TSC-deadline mode, but I
think it's safe to assume that the use case Marcelo was targeting exclusively
used TSC-deadline.

I'm not entirely opposed to playing the advancement games with one-shot hrtimers,
but it's also not clear to me that it's worth doing. E.g. supporting one-shot
hrtimers would likely require a bit of extra complexity to juggle the different
time domains. And if the only use cases that are truly sensitive to timer
programming latency exclusively use TSC-deadline mode (because one-shot mode is
inherently "fuzzy"), then any amount of extra complexity is effectively dead weight.

> should not be applied to:
> sw/hv period

I wouldn't say "should not be applied to", I think it's more "doesn't provide much
benefit to".