Message-ID: <005801db9397$266ddac0$73499040$@telus.net>
Date: Wed, 12 Mar 2025 14:38:52 -0700
From: "Doug Smythies" <dsmythies@...us.net>
To: "'Artem Bityutskiy'" <artem.bityutskiy@...ux.intel.com>,
	"'Rafael J. Wysocki'" <rjw@...ysocki.net>
Cc: "'LKML'" <linux-kernel@...r.kernel.org>,
	"'Daniel Lezcano'" <daniel.lezcano@...aro.org>,
	"'Christian Loehle'" <christian.loehle@....com>,
	"'Aboorva Devarajan'" <aboorvad@...ux.ibm.com>,
	"'Linux PM'" <linux-pm@...r.kernel.org>,
	"Doug Smythies" <dsmythies@...us.net>
Subject: RE: [RFT][PATCH v1 0/5] cpuidle: menu: Avoid discarding useful information when processing recent idle intervals

On 2025.02.07 06:49 Artem Bityutskiy wrote:

> Hi,
>
> thanks for the patches!
> 
> On Thu, 2025-02-06 at 15:21 +0100, Rafael J. Wysocki wrote:
>> Hi Everyone,
>> 
>> This work had been triggered by a report that commit 0611a640e60a ("eventpoll:
>> prefer kfree_rcu() in __ep_remove()") had caused the critical-jOPS metric of
>> the SPECjbb 2015 benchmark [1] to drop by around 50% even though it generally
>> reduced kernel overhead.  Indeed, it was found during further investigation
>> that the total interrupt rate while running the SPECjbb workload had fallen as
>> a result of that commit by 55% and the local timer interrupt rate had fallen
>> by almost 80%.
>
> I ran SPECjbb2015 with and it doubles critical-jOPS and basically makes it
> "normal" again. Thanks!
>
> Reported-by: Artem Bityutskiy <artem.bityutskiy@...ux.intel.com>
> Tested-by: Artem Bityutskiy <artem.bityutskiy@...ux.intel.com>

None of the tests I run show anywhere near that magnitude of change,
and because SPECjbb is not a free test, I thought I would try to create one of my own.
For my "critical-jobs" test I searched for differences over a wide range of
jobs per second and workload per job. I found one significant difference, but
it is the opposite of Artem's SPECjbb results:

kernel      load  seconds  total jobs     jps     min    50th    90th    95th    99th     max
menu-614       1       60     1174500   19575     160     470     620     650     800    1620
menu-614-p     1       60     1175251   19588     160     530     690     720     860    1600
change                                   0.1%    0.0%   12.8%   11.3%   10.8%    7.5%   -1.2%

menu-614       2       60     1102431   18374     250     600     750     790     930    2600
menu-614-p     2       60     1111070   18518     260     560     690     730     860    1360
change                                   0.8%    4.0%   -6.7%   -8.0%   -7.6%   -7.5%  -47.7%

menu-614       3       60      987408   16457     340     920    1040    1090    1210    7100
menu-614-p     3       60     1000063   16668     340     750     850     890     980    2390
change                                   1.3%    0.0%  -18.5%  -18.3%  -18.3%  -19.0%  -66.3%

menu-614       4       60      914690   15245     410    1510    1830    1860    1980    3630
menu-614-p     4       60      927129   15452     440   11790   14920   15160   15400   95720
change                                   1.4%    7.3%  680.8%  715.3%  715.1%  677.8% 2536.9%

menu-614       5       60      885468   14758     540    9680   11400   11800   15460   74040
menu-614-p     5       60      895095   14918     570   25430   30150   30640   31250  137830
change                                   1.1%    5.6%  162.7%  164.5%  159.7%  102.1%   86.2%

menu-614       6       60      840939   14016     630   45660   52070   57750   84340  189980
menu-614-p     6       60      843512   14059     620   44750   52220   58750   85930  199990
change                                   0.3%   -1.6%   -2.0%    0.3%    1.7%    1.9%    5.3%

menu-614       7       60      797438   13291     740   61420   68130   71040  101060  199990
menu-614-p     7       60      796645   13277     670   55630   63790   68140   98920  199990
change                                  -0.1%   -9.5%   -9.4%   -6.4%   -4.1%   -2.1%    0.0%

Notes:
menu-614 = kernel 6.14-rc1
menu-614-p = kernel 6.14-rc1 + this patch set
I am still on rc1 because of earlier testing, reported a few weeks ago.
load is arbitrary, but 2 does twice as much work as 1 and so on.
(for most of this work the load has been between 10 and 1000.)
jps = jobs per second and is queuing task limited for these particular test runs.
The min, percentile, and max columns are job execution times in microseconds.
For the percent calculations, negative is better.
The data is clamped at 199,990 microseconds, so we don't actually know what 3 of the max values were,
not that we really care. It is more the 95th percentile area we care about.
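
As a rough sketch of how the jps column and the percent rows appear to be derived: the arithmetic below is inferred from the numbers in the table, not taken from the test tool itself, and the helper names `jobs_per_second` and `pct_change` are hypothetical. It assumes jps is total jobs divided by run time, rounded to the nearest integer, and that each percent value is the change of the patched kernel relative to the unpatched one.

```python
# Hypothetical helpers; names and rounding are assumptions, not the author's tooling.

def jobs_per_second(total_jobs, seconds):
    """Jobs per second, rounded to the nearest integer as in the jps column."""
    return round(total_jobs / seconds)

def pct_change(baseline, patched):
    """Percent change of patched relative to baseline."""
    return (patched - baseline) / baseline * 100.0

# Load 4 rows, copied from the table (times in microseconds).
columns = ["jps", "min", "50th", "90th", "95th", "99th", "max"]
menu_614 = [15245, 410, 1510, 1830, 1860, 1980, 3630]
menu_614_p = [15452, 440, 11790, 14920, 15160, 15400, 95720]

print("jps:", jobs_per_second(914690, 60), jobs_per_second(927129, 60))
for name, base, new in zip(columns, menu_614, menu_614_p):
    print(f"{name:>4}: {pct_change(base, new):7.1f}%")
```

Run against the load 4 rows, this arithmetic agrees with the percent row in the table, for example 680.8% at the 50th percentile and 2536.9% for the max.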

I am not suggesting that the patch set isn't a net positive overall,
just that I found conditions where the results are poor.

So, what's the point of this email?
It relates to the other thread, "TEO as default governor ?" [1].
That is such a difficult question, as there are often conflicting results.
But overall, in my testing the two governors perform very similarly these days.

[1] https://lore.kernel.org/linux-pm/d6de2118-eae1-4abb-818b-b3420732c82a@arm.com/T/#t