Message-ID: <001a01dc4436$80b80aa0$82281fe0$@telus.net>
Date: Thu, 23 Oct 2025 09:02:57 -0700
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Rafael J. Wysocki'" <rafael@...nel.org>
Cc: "'Frederic Weisbecker'" <frederic@...nel.org>,
	"'LKML'" <linux-kernel@...r.kernel.org>,
	"'Peter Zijlstra'" <peterz@...radead.org>,
	"'Christian Loehle'" <christian.loehle@....com>,
	"'Linux PM'" <linux-pm@...r.kernel.org>,
	"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [PATCH v1 1/3] cpuidle: governors: menu: Avoid selecting states with too much latency

On 2025.10.23 07:51 Rafael wrote:

> Hi Doug,
>
> On Thursday, October 23, 2025 5:05:44 AM CEST Doug Smythies wrote:
>> Hi Rafael,
>> 
>> Recent email communications about other patches had me
>> looking at this one again. 
>> 
>> On 2025.08.13 03:26 Rafael wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@...el.com>
>>>
>> ... snip...
>> 
>>> However, after the above change, latency_req cannot take the predicted_ns
>>> value any more, which takes place after commit 38f83090f515 ("cpuidle:
>>> menu: Remove iowait influence"), because it may cause a polling state
>>> to be returned prematurely.
>>>
>>> In the context of the previous example say that predicted_ns is 3000 and
>>> the PM QoS latency limit is still 20 us.  Additionally, say that idle
>>> state 0 is a polling one.  Moving the exit_latency_ns check before the
>>> target_residency_ns one causes the loop to terminate in the second
>>> iteration, before the target_residency_ns check, so idle state 0 will be
>>> returned even though previously state 1 would be returned if there were
>>> no imminent timers.
>>>
>>> For this reason, remove the assignment of the predicted_ns value to
>>> latency_req from the code.
>> 
>> That is okay for timer-based workflows,
>> but what about non-timer-based, or interrupt-driven, workflows?
>> 
>> Under conditions where idle state 0, or Polling, would be used a lot,
>> I am observing about an 11% throughput regression with this patch,
>> and idle state 0 (polling) usage going from 20% to 0%.
>> 
>> This is from my testing of kernels 6.17-rc1, rc2, and rc3 in August and
>> September, and again now. I missed this in August/September:
>> 
>> 779b1a1cb13a cpuidle: governors: menu: Avoid selecting states with too much latency - v6.17-rc3
>> fa3fa55de0d6 cpuidle: governors: menu: Avoid using invalid recent intervals data - v6.17-rc2
>> baseline reference: v6.17-rc1
>> 
>> teo was also included. As far as I can recall, its response has always been similar to rc3, at least recently.
>> 
>> Three graphs are attached.
>> Data is sampled once every 20 seconds. The test is started after the first idle sample,
>> and at least one sample is taken after the system returns to idle after the test.
>> The faster the test runs, the better.
>> 
>> Test computer:
>> Processor: Intel(R) Core(TM) i5-10600K CPU @ 4.10GHz
>> Distro: Ubuntu 24.04.3, server, no desktop GUI.
>> CPU frequency scaling driver: intel_pstate
>> HWP: disabled.
>> CPU frequency scaling governor: performance
>> Idle driver: intel_idle
>> Idle governor: menu (except teo for one compare test run)
>> Idle states: 4 (name and description):
>>   state0/name:POLL     desc:CPUIDLE CORE POLL IDLE
>>   state1/name:C1_ACPI  desc:ACPI FFH MWAIT 0x0
>>   state2/name:C2_ACPI  desc:ACPI FFH MWAIT 0x30
>>   state3/name:C3_ACPI  desc:ACPI FFH MWAIT 0x60
>
> OK, so since the exit latency of an idle state cannot exceed its target
> residency, the appended change (on top of 6.18-rc2) should make the throughput
> regression go away.

Indeed, the patch you appended below did make the
throughput regression go away.

Thank you.

>
> ---
> drivers/cpuidle/governors/menu.c |    7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> --- a/drivers/cpuidle/governors/menu.c
> +++ b/drivers/cpuidle/governors/menu.c
> @@ -321,10 +321,13 @@ static int menu_select(struct cpuidle_dr
> 
> 		/*
> 		 * Use a physical idle state, not busy polling, unless a timer
> -		 * is going to trigger soon enough.
> +		 * is going to trigger soon enough or the exit latency of the
> +		 * idle state in question is greater than the predicted idle
> +		 * duration.
> 		 */
> 		if ((drv->states[idx].flags & CPUIDLE_FLAG_POLLING) &&
> -		    s->target_residency_ns <= data->next_timer_ns) {
> +		    s->target_residency_ns <= data->next_timer_ns &&
> +		    s->exit_latency_ns <= predicted_ns) {
> 			predicted_ns = s->target_residency_ns;
> 			idx = i;
> 			break;
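
For anyone following along, the patched condition can be read in isolation
as a simple predicate. What follows is a minimal user-space sketch in C, not
the kernel code: the function name prefer_physical_state() and its flattened
parameters are hypothetical stand-ins for drv->states[idx].flags,
s->target_residency_ns, s->exit_latency_ns, data->next_timer_ns and
predicted_ns, and the numbers in the note after it are made up rather than
taken from the test computer.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the patched check in menu_select().
 * In the real loop this branch is only reached when the state under
 * consideration has a target residency above predicted_ns.
 *
 * Returns true when a physical idle state should replace a polling
 * candidate: the next timer is far enough away to cover the state's
 * target residency, and the state's exit latency does not exceed the
 * predicted idle duration.
 */
static bool prefer_physical_state(bool candidate_is_polling,
				  uint64_t target_residency_ns,
				  uint64_t exit_latency_ns,
				  uint64_t next_timer_ns,
				  uint64_t predicted_ns)
{
	return candidate_is_polling &&
	       target_residency_ns <= next_timer_ns &&
	       exit_latency_ns <= predicted_ns;
}
```

With made-up values (polling candidate, target residency 5000 ns, exit
latency 1000 ns, next timer 50000 ns), a predicted idle duration of 3000 ns
makes the predicate true, so the physical state replaces polling. A predicted
idle duration of 500 ns makes it false, so this branch no longer forces the
physical state and polling can still be selected, which would be consistent
with the throughput result reported above.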


