Message-ID: <54247242-3eef-4b99-a235-81dc6637d725@linux.ibm.com>
Date: Wed, 28 Jan 2026 13:26:40 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: K Prateek Nayak <kprateek.nayak@....com>,
"Guo, Wangyang" <wangyang.guo@...el.com>
Cc: linux-kernel@...r.kernel.org, Benjamin Lei <benjamin.lei@...el.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
Tianyou Li <tianyou.li@...el.com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>
Subject: Re: [PATCH v7] sched/clock: Avoid false sharing for sched_clock_irqtime

On 1/28/26 1:20 PM, K Prateek Nayak wrote:
> On 1/28/2026 1:02 PM, Shrikanth Hegde wrote:
>>
>>
>> On 1/28/26 12:48 PM, K Prateek Nayak wrote:
>>> On 1/28/2026 11:56 AM, Shrikanth Hegde wrote:
>>>>
>>>>
>>>> On 1/28/26 8:35 AM, K Prateek Nayak wrote:
>>>>> On 1/28/2026 7:49 AM, Guo, Wangyang wrote:
>>>>>> Yes, when the clock is marked unstable through tsc_.*mark_unstable() with a non-native sched_clock, clear_sched_clock_stable() won't be called, so sched_clock_irqtime stays enabled.
>>>>>>
>>>>>> Maybe the dedicated workqueue for sched_clock_irqtime is still needed considering this case.
>>>>>
>>>>> In that case, shouldn't tsc_init() only enable irqtime when
>>>>> using_native_sched_clock()? How can tsc_init() make a call on irqtime if
>>>>> TSC isn't being used as the sched_clock() ultimately?
>>>>>
>>>>> For kvmclock, if PVCLOCK_TSC_STABLE_BIT is not set, it'll call
>>>>> clear_sched_clock_stable() at kvm_sched_clock_init() but none of the
>>>>> other clocksources do so we can assume once we override the sched_clock()
>>>>> it is up to the sched_clock() provider to deal with the clock stability.
>>>>>
>>>>
>>>> I think this would depend on whether mark_tsc_unstable() happens after
>>>> system boot, especially while running a KVM guest?
>>>
>>> I don't see anything on the guest side that would mark the kvmclock as
>>> unstable if host's TSC turns unstable post init and since kvmclock
>>> doesn't set CLOCK_SOURCE_MUST_VERIFY, I doubt if a watchdog runs to
>>> verify it in the guest.
>>>
>>> I have the following in the guest:
>>>
>>> $ sudo dmesg | grep -i clock
>>> [ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
>>> [ 0.000000] kvm-clock: using sched offset of 423259259 cycles
>>
>> This means pv_sched_clock is kvm_sched_clock_read from now on, and
>> irqtime is enabled in the guest, right?
>
> So within the guest today ...
>
> $ sudo dmesg | grep -i "clock\|tsc"
> [ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> [ 0.000000] kvm-clock: using sched offset of 504626078 cycles
>
> # kvm_sched_clock_init() happens here so it can potentially do
> # clear_sched_clock_stable() here if !PVCLOCK_TSC_STABLE_BIT.
>
> [ 0.000002] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [ 0.000004] tsc: Detected 1996.251 MHz processor
>
> # We enable irqtime here once TSC frequency has been determined
> # without considering using_native_sched_clock()
>
>
> After that TSC is never selected so we don't care if it is stable
> or not since it is not the clocksource - the guest continues on
> with unstable sched_clock() but also irqtime enabled since TSC
> was calibrated successfully.
>
>>
>>> [ 0.000002] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
>>> [ 0.071675] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
>>> [ 0.378467] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
>>> [ 0.388678] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x398cb1e4d56, max_idle_ns: 881590790753 ns
>>> [ 0.679262] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
>>> [ 0.903121] PTP clock support registered
>>> [ 0.927243] clocksource: Switched to clocksource kvm-clock
>>> [ 0.944986] clocksource: acpi_pm: mask: 0xffffff max_cycles: 0xffffff, max_idle_ns: 2085701024 ns
>>> [ 0.993198] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x398cb1e4d56, max_idle_ns: 881590790753 ns
>>> [ 1.123796] rtc_cmos 00:05: setting system clock to 2026-01-28T07:03:45 UTC (1769583825)
>>> [ 1.155755] sched_clock: Marking stable (940009972, 212965288)->(1171254846, -18279586)
>>> [ 1.712598] clk: Disabling unused clocks
>>>
>>> Then I mark TSC unstable on the host
>>>
>>> tsc: Marking TSC unstable due to Faking unreliable TSC!
>>> TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
>>> clocksource: Checking clocksource tsc synchronization from CPU 93 to CPUs 0,2,26,75,101,114,118,195.
>>> sched_clock: Marking unstable (945948313746, 69389667)<-(947618130068, -1600430832)
>>> clocksource: CPU 93 check durations 3436ns - 25277ns for clocksource tsc.
>>> clocksource: Switched to clocksource hpet
>>>
>>
>> So now, using_native_sched_clock() should fail in the guest? If so, with
>> the patch, irqtime won't be disabled, no?
>
> Ideally yes, but the guest continues using kvmclock without any hitch.
> I think the x86 KVM layer has something to ensure stability but I'm
> not 100% sure.
>
> Since I don't see "tsc: Marking TSC unstable ..." or "sched_clock:
> Marking unstable ..." in the guest, we don't hit the mark_tsc_unstable()
> path within the guest which would disable irqtime today so essentially
> host's TSC turning unstable doesn't seem to affect the guest.
>
>>
Okay. Fair enough.
Then v7 should cover all scenarios, I think. With that,
Reviewed-by: Shrikanth Hegde <sshegde@...ux.ibm.com>