Message-ID: <aMwgOJc6Hq17uFzj@localhost.localdomain>
Date: Thu, 18 Sep 2025 17:07:36 +0200
From: Frederic Weisbecker <frederic@...nel.org>
To: "Rafael J. Wysocki" <rafael@...nel.org>
Cc: Linux PM <linux-pm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Christian Loehle <christian.loehle@....com>
Subject: Re: [PATCH v1 3/3] cpuidle: governors: menu: Special-case nohz_full CPUs
On Thu, Sep 11, 2025 at 07:07:42PM +0200, Rafael J. Wysocki wrote:
> On Thu, Sep 11, 2025 at 4:17 PM Frederic Weisbecker <frederic@...nel.org> wrote:
> > So, when !tick_nohz_full_cpu(dev->cpu), what is the purpose of this tick stopped
> > special case?
> >
> > Is it because the next dynamic tick is a better prediction than the typical
> > interval once the tick is stopped?
>
> When !tick_nohz_full_cpu(dev->cpu), the tick is a safety net against
> getting stuck in a shallow idle state for too long. In that case, if
> the tick is stopped, the safety net is not there and it is better to
> use a deep state.
Right.
> However, data->next_timer_ns is a lower limit for the idle state
> target residency because this is when the next timer is going to
> trigger.
Ok.
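
For illustration only, here is a toy Python model of the behavior described in the quoted text above. It is not kernel code and not what the patch does line by line. The state table, the function names and the residency numbers are all made up for the example; only the rule itself (tick stopped on a non-nohz_full CPU means trusting the next timer instead of the shorter prediction) is taken from the discussion.

```python
# Hypothetical idle states: (name, target residency in ns), shallowest first.
# These numbers are invented for the example, not taken from any real driver.
IDLE_STATES = [
    ("shallow", 2_000),
    ("medium", 100_000),
    ("deep", 1_000_000),
]


def effective_prediction_ns(predicted_ns, next_timer_ns, tick_stopped, nohz_full):
    """Return the idle duration estimate used for state selection.

    The CPU cannot stay idle past its next timer, so the estimate is
    capped by next_timer_ns. If the tick is stopped on a CPU that is
    not nohz_full, there is no tick to rescue a bad short prediction,
    so this model falls back to the time till the next timer.
    """
    predicted_ns = min(predicted_ns, next_timer_ns)
    if tick_stopped and not nohz_full:
        return next_timer_ns
    return predicted_ns


def select_state(prediction_ns):
    """Pick the deepest state whose target residency fits the estimate."""
    chosen = IDLE_STATES[0][0]
    for name, target_residency_ns in IDLE_STATES:
        if target_residency_ns <= prediction_ns:
            chosen = name
    return chosen
```

With a 50 us prediction and the next timer 5 ms away, this model picks the shallow state while the tick is running, the deep state once the tick is stopped on a regular CPU, and the shallow state again on a nohz_full CPU, which is the more "pessimistic" outcome asked about further down.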
>
> > Does that mean we might become more "pessimistic" concerning the predicted idle
> > time for nohz_full CPUs?
>
> Yes, and not just we might, but we do unless the idle periods in the
> workload are "long".
Ok.
>
> > I guess too shallow C-states are still better than too deep but there should be
> > a word about that introduced side effect (if any).
>
> Yeah, I agree.
>
> That said, on a nohz_full CPU there is no safety net against getting
> stuck in a shallow idle state because the tick is not present. That's
> why currently the governors don't allow shallow states to be used on
> nohz_full CPUs.
>
> The lack of a safety net is generally not a problem when the CPU has
> been isolated to run something doing real work all the time, with
> possible idle periods in the workload, but there are people who
> isolate CPUs for energy-saving reasons and don't run anything on them
> on purpose. For those folks, the current behavior to select deep idle
> states every time is actually desirable.
So far I haven't heard of anybody using nohz_full for power saving. If
you have, I'd be curious about it. Whether a task runs tickless or not, it
still runs and the CPU isn't sleeping. Also, CPU 0 stays periodic on nohz_full,
which alone is a problem for power saving and also prevents a whole package
from entering low-power mode on NUMA.
Let's say it's not optimized toward power saving...
> So there are two use cases that cannot be addressed at once and I'm
> thinking about adding a control knob to allow the user to decide which
> way to go.
I'm tempted to say we should focus on avoiding states that are too deep,
at the expense of sometimes picking states that are too shallow. Of course I'm not entirely
comfortable with the idea because nohz_full CPUs may be idle for a while
on some workloads. And everyone deserves a rest at some point after
a long day.
I guess force restarting the tick upon idle entry would probably be
bad for tiny idle round-trips?
As for such a knob, I'm not sure anybody would use it.
Thanks.
--
Frederic Weisbecker
SUSE Labs