linux-kernel - Re: [PATCH v2 RSEND] sched/fair: Optimize EAS energy calculation complexity from O(N) to O(1) inside inner loop


Message-ID: <20260204121145.3951995-1-realwujing@gmail.com>
Date: Wed,  4 Feb 2026 07:11:41 -0500
From: Qiliang Yuan <realwujing@...il.com>
To: vincent.guittot@...aro.org,
	christian.loehle@....com
Cc: bsegall@...gle.com,
	dietmar.eggemann@....com,
	juri.lelli@...hat.com,
	linux-kernel@...r.kernel.org,
	mgorman@...e.de,
	mingo@...hat.com,
	peterz@...radead.org,
	realwujing@...il.com,
	rostedt@...dmis.org,
	vschneid@...hat.com,
	yuanql9@...natelecom.cn
Subject: Re: [PATCH v2 RSEND] sched/fair: Optimize EAS energy calculation complexity from O(N) to O(1) inside inner loop

Hi Christian, Vincent,

Thank you for the detailed feedback.

On Mon, Feb 02, 2026 at 10:48:04AM +0000, Christian Loehle wrote:
> Which is still O(n), I think the title is misleading.

On Tue, Feb 03, 2026 at 06:16:27PM +0100, Vincent Guittot wrote:
> Ok, but the whole feec() remains O(n)

You are absolutely right. While the per-candidate CPU energy estimation was 
optimized, the overall complexity of find_energy_efficient_cpu() remains 
O(N). I've renamed the patch in v3 to "Optimize EAS by reducing redundant 
performance domain scans" to more accurately reflect the scope of the 
improvement.

On Mon, Feb 02, 2026 at 10:48:04AM +0000, Christian Loehle wrote:
> I don't think this is actually true. EAS doesn't really work with a large 
> number of PDs because of the expensive wakeup path.
> I don't think there's an EAS system out there where this would actually make 
> a measurable impact.

On Tue, Feb 03, 2026 at 06:16:27PM +0100, Vincent Guittot wrote:
> Could you add some figures to highlight the statement above ?

In v3, I've further optimized the path by consolidating the 'pd_max_util' and 
'pd_busy_time' calculations into the same loop that finds the 
'max_spare_cap_cpu'. This reduces the total number of full PD scans from three 
down to one per performance domain.
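
To illustrate the idea only, here is a minimal userspace sketch of a single
pass that produces all three values at once. It is not the code from the
patch: 'struct cpu_sample', 'struct pd_summary' and 'pd_scan_once()' are
hypothetical names, and the per-CPU clamping is a simplifying assumption
rather than the exact kernel arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU snapshot; stands in for the kernel's rq data. */
struct cpu_sample {
	unsigned long capacity;	/* usable capacity of this CPU */
	unsigned long util;	/* current utilization estimate */
};

/* Hypothetical result of one scan over a performance domain. */
struct pd_summary {
	unsigned long busy_time;	/* sum of clamped per-CPU utilization */
	unsigned long max_util;		/* highest clamped per-CPU utilization */
	long max_spare_cap;		/* largest spare capacity, -1 if no CPU */
	int max_spare_cap_cpu;		/* index of that CPU, -1 if no CPU */
};

/*
 * Visit each CPU of the domain exactly once and accumulate busy time,
 * max utilization and the max-spare-capacity CPU in the same iteration,
 * instead of running one loop per quantity.
 */
static struct pd_summary pd_scan_once(const struct cpu_sample *cpus,
				      size_t nr_cpus)
{
	struct pd_summary s = {
		.busy_time = 0,
		.max_util = 0,
		.max_spare_cap = -1,
		.max_spare_cap_cpu = -1,
	};
	size_t i;

	assert(cpus != NULL || nr_cpus == 0);

	for (i = 0; i < nr_cpus; i++) {
		unsigned long cap = cpus[i].capacity;
		unsigned long util = cpus[i].util;
		long spare;

		/* Assumption: utilization never exceeds the CPU's capacity. */
		if (util > cap)
			util = cap;

		s.busy_time += util;
		if (util > s.max_util)
			s.max_util = util;

		/* Strict comparison: the first CPU wins on ties. */
		spare = (long)(cap - util);
		if (spare > s.max_spare_cap) {
			s.max_spare_cap = spare;
			s.max_spare_cap_cpu = (int)i;
		}
	}

	return s;
}
```

The loop still visits every CPU in the domain, so the cost remains linear in
the number of CPUs; only the number of passes over the same data goes down.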

I agree that the impact on current mobile systems with 2-3 PDs is likely to be
small. However, as topologies grow and the wake-up path becomes more sensitive
to cache misses, avoiding repeated scans of the same task structures and rq
utilization data is a worthwhile constant-factor improvement. I'm investigating
synthetic benchmarks on systems with higher core counts so that I can provide
concrete figures.

I've sent out v3, which includes these consolidations.

v3 link: https://lore.kernel.org/all/20260204120509.3950227-1-realwujing@gmail.com/

Thanks,
Qiliang