[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <001a01dc4436$80b80aa0$82281fe0$@telus.net>
Date: Thu, 23 Oct 2025 09:02:57 -0700
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Rafael J. Wysocki'" <rafael@...nel.org>
Cc: "'Frederic Weisbecker'" <frederic@...nel.org>,
"'LKML'" <linux-kernel@...r.kernel.org>,
"'Peter Zijlstra'" <peterz@...radead.org>,
"'Christian Loehle'" <christian.loehle@....com>,
"'Linux PM'" <linux-pm@...r.kernel.org>,
"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [PATCH v1 1/3] cpuidle: governors: menu: Avoid selecting states with too much latency
On 2025.10.23 07:51 Rafael wrote:
> Hi Doug,
>
> On Thursday, October 23, 2025 5:05:44 AM CEST Doug Smythies wrote:
>> Hi Rafael,
>>
>> Recent email communications about other patches had me
>> looking at this one again.
>>
>> On 2025.08.13 03:26 Rafael wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>>>
>> ... snip...
>>
>>> However, after the above change, latency_req cannot take the predicted_ns
>>> value any more, which takes place after commit 38f83090f515 ("cpuidle:
>>> menu: Remove iowait influence"), because it may cause a polling state
>>> to be returned prematurely.
>>>
>>> In the context of the previous example say that predicted_ns is 3000 and
>>> the PM QoS latency limit is still 20 us. Additionally, say that idle
>>> state 0 is a polling one. Moving the exit_latency_ns check before the
>>> target_residency_ns one causes the loop to terminate in the second
>>> iteration, before the target_residency_ns check, so idle state 0 will be
>>> returned even though previously state 1 would be returned if there were
>>> no imminent timers.
>>>
>>> For this reason, remove the assignment of the predicted_ns value to
>>> latency_req from the code.
>>
>> Which is okay for timer-based workflow,
>> but what about non-timer based, or interrupt driven, workflow?
>>
>> Under conditions where idle state 0, or Polling, would be used a lot,
>> I am observing about a 11 % throughput regression with this patch
>> And idle state 0, polling, usage going from 20% to 0%.
>>
>> From my testing of kernels 6.17-rc1, rc2,rc3 in August and September
>> and again now. I missed this in August/September:
>>
>> 779b1a1cb13a cpuidle: governors: menu: Avoid selecting states with too much latency - v6.17-rc3
>> fa3fa55de0d6 cpuidle: governors: menu: Avoid using invalid recent intervals data - v6.17-rc2
>> baseline reference: v6.17-rc1
>>
>> teo was included also. As far as I can recall its response has always been similar to rc3. At least, recently.
>>
>> Three graphs are attached:
>> Sampling data once per 20 seconds, the test is started after the first idle sample,
>> and at least one sample is taken after the system returns to idle after the test.
>> The faster the test runs the better.
>>
>> Test computer:
>> Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
>> Distro: Ubuntu 24.04.3, server, no desktop GUI.
>> CPU frequency scaling driver: intel_pstate
>> HWP: disabled.
>> CPU frequency scaling governor: performance
>> Ilde driver: intel_idle
>> Idle governor: menu (except teo for one compare test run)
>> Idle states: 4: name : description:
>> state0/name:POLL desc:CPUIDLE CORE POLL IDLE
>> state1/name:C1_ACPI desc:ACPI FFH MWAIT 0x0
>> state2/name:C2_ACPI desc:ACPI FFH MWAIT 0x30
>> state3/name:C3_ACPI desc:ACPI FFH MWAIT 0x60
>
> OK, so since the exit residency of an idle state cannot exceed its target
> residency, the appended change (on top of 6.18-rc2) should make the throughput
> regression go away.
Indeed, the patch you appended below did make the
throughput regression go away.
Thank you.
>
> ---
> drivers/cpuidle/governors/menu.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> --- a/drivers/cpuidle/governors/menu.c
> +++ b/drivers/cpuidle/governors/menu.c
> @@ -321,10 +321,13 @@ static int menu_select(struct cpuidle_dr
>
> /*
> * Use a physical idle state, not busy polling, unless a timer
> - * is going to trigger soon enough.
> + * is going to trigger soon enough or the exit latency of the
> + * idle state in question is greater than the predicted idle
> + * duration.
> */
> if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
> - s->target_residency_ns <= data->next_timer_ns) {
> + s->target_residency_ns <= data->next_timer_ns &&
> + s->exit_latency_ns <= predicted_ns) {
> predicted_ns = s->target_residency_ns;
> idx = i;
> break;
Powered by blists - more mailing lists