[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5040239.GXAFRqVoOG@rafael.j.wysocki>
Date: Thu, 23 Oct 2025 16:51:02 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Doug Smythies <dsmythies@...us.net>
Cc: 'Frederic Weisbecker' <frederic@...nel.org>,
'LKML' <linux-kernel@...r.kernel.org>,
'Peter Zijlstra' <peterz@...radead.org>,
'Christian Loehle' <christian.loehle@....com>,
'Linux PM' <linux-pm@...r.kernel.org>, Doug Smythies <dsmythies@...us.net>
Subject:
Re: [PATCH v1 1/3] cpuidle: governors: menu: Avoid selecting states with too
much latency
Hi Doug,
On Thursday, October 23, 2025 5:05:44 AM CEST Doug Smythies wrote:
> Hi Rafael,
>
> Recent email communications about other patches had me
> looking at this one again.
>
> On 2025.08.13 03:26 Rafael wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
> >
> ... snip...
>
> > However, after the above change, latency_req cannot take the predicted_ns
> > value any more, which takes place after commit 38f83090f515 ("cpuidle:
> > menu: Remove iowait influence"), because it may cause a polling state
> > to be returned prematurely.
> >
> > In the context of the previous example say that predicted_ns is 3000 and
> > the PM QoS latency limit is still 20 us. Additionally, say that idle
> > state 0 is a polling one. Moving the exit_latency_ns check before the
> > target_residency_ns one causes the loop to terminate in the second
> > iteration, before the target_residency_ns check, so idle state 0 will be
> > returned even though previously state 1 would be returned if there were
> > no imminent timers.
> >
> > For this reason, remove the assignment of the predicted_ns value to
> > latency_req from the code.
>
> Which is okay for timer-based workflow,
> but what about non-timer based, or interrupt driven, workflow?
>
> Under conditions where idle state 0, or Polling, would be used a lot,
> I am observing about a 11 % throughput regression with this patch
> And idle state 0, polling, usage going from 20% to 0%.
>
> From my testing of kernels 6.17-rc1, rc2,rc3 in August and September
> and again now. I missed this in August/September:
>
> 779b1a1cb13a cpuidle: governors: menu: Avoid selecting states with too much latency - v6.17-rc3
> fa3fa55de0d6 cpuidle: governors: menu: Avoid using invalid recent intervals data - v6.17-rc2
> baseline reference: v6.17-rc1
>
> teo was included also. As far as I can recall its response has always been similar to rc3. At least, recently.
>
> Three graphs are attached:
> Sampling data once per 20 seconds, the test is started after the first idle sample,
> and at least one sample is taken after the system returns to idle after the test.
> The faster the test runs the better.
>
> Test computer:
> Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
> Distro: Ubuntu 24.04.3, server, no desktop GUI.
> CPU frequency scaling driver: intel_pstate
> HWP: disabled.
> CPU frequency scaling governor: performance
> Ilde driver: intel_idle
> Idle governor: menu (except teo for one compare test run)
> Idle states: 4: name : description:
> state0/name:POLL desc:CPUIDLE CORE POLL IDLE
> state1/name:C1_ACPI desc:ACPI FFH MWAIT 0x0
> state2/name:C2_ACPI desc:ACPI FFH MWAIT 0x30
> state3/name:C3_ACPI desc:ACPI FFH MWAIT 0x60
OK, so since the exit residency of an idle state cannot exceed its target
residency, the appended change (on top of 6.18-rc2) should make the throughput
regression go away.
---
drivers/cpuidle/governors/menu.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/cpuidle/governors/menu.c
+++ b/drivers/cpuidle/governors/menu.c
@@ -321,10 +321,13 @@ static int menu_select(struct cpuidle_dr
/*
* Use a physical idle state, not busy polling, unless a timer
- * is going to trigger soon enough.
+ * is going to trigger soon enough or the exit latency of the
+ * idle state in question is greater than the predicted idle
+ * duration.
*/
if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
- s->target_residency_ns <= data->next_timer_ns) {
+ s->target_residency_ns <= data->next_timer_ns &&
+ s->exit_latency_ns <= predicted_ns) {
predicted_ns = s->target_residency_ns;
idx = i;
break;
Powered by blists - more mailing lists