[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3b69443c-f82c-4b73-8384-2a039d730213@intel.com>
Date: Wed, 28 Jan 2026 10:19:21 +0800
From: "Guo, Wangyang" <wangyang.guo@...el.com>
To: Shrikanth Hegde <sshegde@...ux.ibm.com>,
K Prateek Nayak <kprateek.nayak@....com>
Cc: linux-kernel@...r.kernel.org, Benjamin Lei <benjamin.lei@...el.com>,
Tim Chen <tim.c.chen@...ux.intel.com>, Tianyou Li <tianyou.li@...el.com>,
Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>
Subject: Re: [PATCH v7] sched/clock: Avoid false sharing for
sched_clock_irqtime
On 1/27/2026 7:04 PM, Shrikanth Hegde wrote:
> Hi Wangyang, Prateek.
>
> On 1/27/26 12:55 PM, Wangyang Guo wrote:
>> Read-mostly sched_clock_irqtime may share the same cacheline with
>> frequently updated nohz struct. Make it as static_key to avoid
>> false sharing issue.
>>
>> The only user of disable_sched_clock_irqtime()
>> is tsc_.*mark_unstable() which may be invoked under atomic context
>> and require a workqueue to disable static_key. But both of them
>> calls clear_sched_clock_stable() just before doing
>> disable_sched_clock_irqtime(). We can reuse
>> "sched_clock_work" to also disable sched_clock_irqtime().
>>
>> One additional case need to handle is if the tsc is marked unstable
>> before late_initcall() phase, sched_clock_work will not be invoked
>> and sched_clock_irqtime will stay enabled although clock is unstable:
>> tsc_init()
>> enable_sched_clock_irqtime() # irqtime accounting is enabled here
>> ...
>> if (unsynchronized_tsc()) # true
>> mark_tsc_unstable()
>> clear_sched_clock_stable()
>> __sched_clock_stable_early = 0;
>> ...
>> if (static_key_count(&sched_clock_running.key) == 2)
>> # Only happens at sched_clock_init_late()
>> __clear_sched_clock_stable(); # Never executed
>> ...
>>
>> # late_initcall() phase
>> sched_clock_init_late()
>> if (__sched_clock_stable_early) # Already false
>> __set_sched_clock_stable(); # sched_clock is never marked stable
>> # TSC unstable, but sched_clock_work won't run to disable irqtime
>>
>> So we need to disable_sched_clock_irqtime() in sched_clock_init_late()
>> if clock is unstable.
>>
>
> Do you this as a valid case? have you tested with CONFIG_PARAVIRT?
>
> Lets say you have a non native sched clock such as kvm_sched_clock_read.
> tsc_init -> sets enable_sched_clock_irqtime()
> ->mark_tsc_unstable -> if using_native_sched_clock ->
> clear_sched_clock_stable
>
> In this case, since clear_sched_clock_stable won't be called you may not
> disable the
> sched clock irqtime since __sched_clock_stable_early is reset only in
> clear_sched_clock_stable
For hypervisor, I see this path may call clear_sched_clock_stable when
clock is unstable at init:
kvm_init_platform() ->
kvmclock_init() -> kvm_sched_clock_init(stable):
if (!stable) clear_sched_clock_stable()
paravirt_set_sched_clock(kvm_sched_clock_read)
>
> Bigger concern(maybe) is clock source marked as stable still, though
> called mark_tsc_unstable in
> non native sched clock?
>
> Disclaimer: (just curious, seeing this x86 code for first time, so may
> not know all paths)
>
Yes, when clock mark unstable through tsc_.*mark_unstable() with
non-native_sched_clock, clear_sched_clock_stable won't be called, thus
sched_clock_irqtime still keep enabled.
Maybe the dedicated workqueue for sched_clock_irqtime is still needed
considering this case.
Powered by blists - more mailing lists