lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJZ5v0jpfZt-BB579mBkSkNX6em4sHKwiLYCgm4dzzkUFeexmQ@mail.gmail.com>
Date: Mon, 18 Aug 2025 19:08:07 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Christian Loehle <christian.loehle@....com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>, Linux PM <linux-pm@...r.kernel.org>, 
	Frederic Weisbecker <frederic@...nel.org>, LKML <linux-kernel@...r.kernel.org>, 
	Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v1 1/3] cpuidle: governors: menu: Avoid selecting states
 with too much latency

On Wed, Aug 13, 2025 at 9:13 PM Christian Loehle
<christian.loehle@....com> wrote:
>
> On 8/13/25 11:25, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> >
> > Occasionally, the exit latency of the idle state selected by the menu
> > governor may exceed the PM QoS CPU wakeup latency limit.  Namely, if the
> > scheduler tick has been stopped already and predicted_ns is greater than
> > the tick period length, the governor may return an idle state whose exit
> > latency exceeds latency_req because that decision is made before
> > checking the current idle state's exit latency.
> >
> > For instance, say that there are 3 idle states, 0, 1, and 2.  For idle
> > states 0 and 1, the exit latency is equal to the target residency and
> > the values are 0 and 5 us, respectively.  State 2 is deeper and has the
> > exit latency and target residency of 200 us and 2 ms (which is greater
> > than the tick period length), respectively.
> >
> > Say that predicted_ns is equal to TICK_NSEC and the PM QoS latency
> > limit is 20 us.  After the first two iterations of the main loop in
> > menu_select(), idx becomes 1 and in the third iteration of it the target
> Can drop "of it" here?

But it also doesn't really hurt I think.

> > residency of the current state (state 2) is greater than predicted_ns.
> > State 2 is not a polling one and predicted_ns is not less than TICK_NSEC,
> > so the check on whether or not the tick has been stopped is done.  Say
> > that the tick has been stopped already and there are no imminent timers
> > (that is, delta_tick is greater than the target residency of state 2).
> > In that case, idx becomes 2 and it is returned immediately, but the exit
> > latency of state 2 exceeds the latency limit.
> >
> > Address this issue by modifying the code to compare the exit latency of
> > the current idle state (idle state i) with the latency limit before
> > comparing its target residecy with predicted_ns, which allows one
> residency

Fixed while applying.

> > more exit_latency_ns check that becomes redundant to be dropped.
> >
> > However, after the above change, latency_req cannot take the predicted_ns
> > value any more, which takes place after commit 38f83090f515 ("cpuidle:
> > menu: Remove iowait influence"), because it may cause a polling state
> > to be returned prematurely.
> >
> > In the context of the previous example say that predicted_ns is 3000 and
> > the PM QoS latency limit is still 20 us.  Additionally, say that idle
> > state 0 is a polling one.  Moving the exit_latency_ns check before the
> > target_residency_ns one causes the loop to terminate in the second
> > iteration, before the target_residency_ns check, so idle state 0 will be
> > returned even though previously state 1 would be returned if there were
> > no imminent timers.
> >
> > For this reason, remove the assignment of the predicted_ns value to
> > latency_req from the code.
> >
> > Fixes: 5ef499cd571c ("cpuidle: menu: Handle stopped tick more aggressively")
> > Cc: 4.17+ <stable@...r.kernel.org> # 4.17+
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> > ---
> >  drivers/cpuidle/governors/menu.c |   29 ++++++++++++-----------------
> >  1 file changed, 12 insertions(+), 17 deletions(-)
> >
> > --- a/drivers/cpuidle/governors/menu.c
> > +++ b/drivers/cpuidle/governors/menu.c
> > @@ -287,20 +287,15 @@
> >               return 0;
> >       }
> >
> > -     if (tick_nohz_tick_stopped()) {
> > -             /*
> > -              * If the tick is already stopped, the cost of possible short
> > -              * idle duration misprediction is much higher, because the CPU
> > -              * may be stuck in a shallow idle state for a long time as a
> > -              * result of it.  In that case say we might mispredict and use
> > -              * the known time till the closest timer event for the idle
> > -              * state selection.
> > -              */
> > -             if (predicted_ns < TICK_NSEC)
> > -                     predicted_ns = data->next_timer_ns;
> > -     } else if (latency_req > predicted_ns) {
> > -             latency_req = predicted_ns;
> > -     }
> > +     /*
> > +      * If the tick is already stopped, the cost of possible short idle
> > +      * duration misprediction is much higher, because the CPU may be stuck
> > +      * in a shallow idle state for a long time as a result of it.  In that
> > +      * case, say we might mispredict and use the known time till the closest
> > +      * timer event for the idle state selection.
> > +      */
> > +     if (tick_nohz_tick_stopped() && predicted_ns < TICK_NSEC)
> > +             predicted_ns = data->next_timer_ns;
> >
> >       /*
> >        * Find the idle state with the lowest power while satisfying
> > @@ -316,13 +311,15 @@
> >               if (idx == -1)
> >                       idx = i; /* first enabled state */
> >
> > +             if (s->exit_latency_ns > latency_req)
> > +                     break;
> > +
> >               if (s->target_residency_ns > predicted_ns) {
> >                       /*
> >                        * Use a physical idle state, not busy polling, unless
> >                        * a timer is going to trigger soon enough.
> >                        */
> >                       if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
> > -                         s->exit_latency_ns <= latency_req &&
> >                           s->target_residency_ns <= data->next_timer_ns) {
> >                               predicted_ns = s->target_residency_ns;
> >                               idx = i;
> > @@ -354,8 +351,6 @@
> >
> >                       return idx;
> >               }
> > -             if (s->exit_latency_ns > latency_req)
> > -                     break;
> >
> >               idx = i;
> >       }
> >
> >
> >
>
> Good catch!
> Reviewed-by: Christian Loehle <christian.loehle@....com>

Thank you!

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ