Message-ID: <aPFcKDJIP4-maoqI@swahl-home.5wahls.com>
Date: Thu, 16 Oct 2025 15:57:12 -0500
From: Steve Wahl <steve.wahl@....com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: Steve Wahl <steve.wahl@....com>,
Anna-Maria Behnsen <anna-maria@...utronix.de>,
Frederic Weisbecker <frederic@...nel.org>,
Ingo Molnar <mingo@...nel.org>, linux-kernel@...r.kernel.org,
Russ Anderson <rja@....com>, Dimitri Sivanich <sivanich@....com>,
Kyle Meyer <kyle.meyer@....com>
Subject: Re: [PATCH] tick/sched: Use trylock for jiffies updates by
non-timekeeper CPUs

On Thu, Oct 16, 2025 at 10:07:07PM +0200, Thomas Gleixner wrote:
> On Mon, Oct 13 2025 at 10:09, Steve Wahl wrote:
> > -static void tick_do_update_jiffies64(ktime_t now)
> > +static bool _tick_do_update_jiffies64(ktime_t now, bool trylock)
> > {
> > unsigned long ticks = 1;
> > ktime_t delta, nextp;
> > @@ -70,7 +70,7 @@ static void tick_do_update_jiffies64(ktime_t now)
> > */
> > if (IS_ENABLED(CONFIG_64BIT)) {
> > if (ktime_before(now, smp_load_acquire(&tick_next_period)))
> > - return;
> > + return true;
> > } else {
> > unsigned int seq;
> >
> > @@ -84,18 +84,24 @@ static void tick_do_update_jiffies64(ktime_t now)
> > } while (read_seqcount_retry(&jiffies_seq, seq));
> >
> > if (ktime_before(now, nextp))
> > - return;
> > + return true;
> > }
> >
> > /* Quick check failed, i.e. update is required. */
> > - raw_spin_lock(&jiffies_lock);
> > + if (trylock) {
> > + /* The cpu holding the lock will do the update. */
> > + if (!raw_spin_trylock(&jiffies_lock))
> > + return false;
> > + } else {
> > + raw_spin_lock(&jiffies_lock);
> > + }
>
> Why inflicting this horrible conditional locking scheme into the main
> path? This can be solved without all this churn completely independent
> from this function.

Why? Because my first crack at the problem was to just switch to a
trylock unconditionally. But I saw that some callers might depend on
time being updated before return, and I would say I didn't "zoom out"
far enough from that simple fix when trying to accommodate that.

> Something like the uncompiled below. You get the idea.

I like this approach.

The reason I'm getting into this situation seems to be that the
designated timekeeper is actually doing the jiffies update work: it
holds the lock but hasn't finished processing yet. Under those
conditions, this approach will have one extra CPU stuck waiting on
the jiffies lock.
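
Just to check my reading of the gate, it behaves roughly like the
standalone userspace sketch below (C11 atomics; do_jiffies_update()
is only a stand-in for tick_do_update_jiffies64(), not the real
kernel call):

/*
 * Sketch of the in_progress gate: at most one CPU that notices
 * stalled jiffies takes the slow path and may block on the lock;
 * all the others bail out immediately and retry on a later tick.
 */
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int in_progress;

/* Stand-in for tick_do_update_jiffies64(); may block on the lock. */
extern void do_jiffies_update(void);

static bool try_update_once(void)
{
	int inp = atomic_load(&in_progress);

	/* Someone else already claimed the update: don't pile up. */
	if (inp || !atomic_compare_exchange_strong(&in_progress, &inp, 1))
		return false;

	do_jiffies_update();
	atomic_store(&in_progress, 0);
	return true;
}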

But that's far better than thousands, and I think it would be an
acceptable tradeoff for code readability. I will make it compile and
see how it tests, and make it into patch v2 if it seems OK.

> Thanks,
>
> tglx

Thank you!

--> Steve Wahl

> ---
> --- a/kernel/time/tick-sched.c
> +++ b/kernel/time/tick-sched.c
> @@ -203,6 +203,21 @@ static inline void tick_sched_flag_clear
>
> #define MAX_STALLED_JIFFIES 5
>
> +static bool tick_try_update_jiffies64(struct tick_sched *ts, ktime_t now)
> +{
> + static atomic_t in_progress;
> + int inp;
> +
> + inp = atomic_read(&in_progress);
> + if (inp || !atomic_try_cmpxchg(&in_progress, &inp, 1))
> + return false;
> +
> + if (ts->last_tick_jiffies == jiffies)
> + tick_do_update_jiffies64(now);
> + atomic_set(&in_progress, 0);
> + return true;
> +}
> +
> static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now)
> {
> int tick_cpu, cpu = smp_processor_id();
> @@ -239,10 +254,11 @@ static void tick_sched_do_timer(struct t
> ts->stalled_jiffies = 0;
> ts->last_tick_jiffies = READ_ONCE(jiffies);
> } else {
> - if (++ts->stalled_jiffies == MAX_STALLED_JIFFIES) {
> - tick_do_update_jiffies64(now);
> - ts->stalled_jiffies = 0;
> - ts->last_tick_jiffies = READ_ONCE(jiffies);
> + if (++ts->stalled_jiffies >= MAX_STALLED_JIFFIES) {
> + if (tick_try_update_jiffies64(ts, now)) {
> + ts->stalled_jiffies = 0;
> + ts->last_tick_jiffies = READ_ONCE(jiffies);
> + }
> }
> }
>
--
Steve Wahl, Hewlett Packard Enterprise