linux-kernel - Re: [RFC PATCH v2 0/7] uclamp sum aggregation


Message-ID: <1cc2b8c4-ea0e-4e98-a1a3-2916cccb1ab1@arm.com>
Date: Tue, 6 Feb 2024 17:32:42 +0000
From: Hongyan Xia <hongyan.xia2@....com>
To: Qais Yousef <qyousef@...alina.io>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Dietmar Eggemann <dietmar.eggemann@....com>,
 Morten Rasmussen <morten.rasmussen@....com>,
 Lukasz Luba <lukasz.luba@....com>,
 Christian Loehle <christian.loehle@....com>, linux-kernel@...r.kernel.org,
 David Dai <davidai@...gle.com>, Saravana Kannan <saravanak@...gle.com>
Subject: Re: [RFC PATCH v2 0/7] uclamp sum aggregation

On 06/02/2024 15:20, Qais Yousef wrote:
> On 02/01/24 13:11, Hongyan Xia wrote:
> 
>> [1]: https://lore.kernel.org/all/20230331014356.1033759-1-davidai@google.com/
> 
> Their solution is not acceptable for the same reason yours isn't. Saravana and
> David know this and we discussed at LPC. uclamp hints are limits and should not
> be summed.

Uclamp is a performance hint, and nothing in its definition says it
cannot be summed. Whether a uclamp approach should be taken ought to be
determined by how well it works as a hint, not by how we calculate it. I
do not reject max aggregation simply because it throws away all uclamp
values except the max. I reject it because I have real evaluation
results showing that sum aggregation works as a much better hint.
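
As a rough illustration of the distinction argued above, the toy sketch
below contrasts the two ways of combining per-task clamps on one
runqueue. It is not code from this series or from the kernel: the
struct, the function names, and the 1024 scale constant are hypothetical
names chosen for the example, and the "sum" variant is a simplification
(clamp each task's utilization, then add) of what the series does with
its clamped PELT signal.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAPACITY_SCALE 1024U /* assumed 0..1024 utilization scale */

/* Hypothetical per-task record; not a kernel structure. */
struct toy_task {
	unsigned int util;       /* unclamped utilization estimate */
	unsigned int uclamp_min; /* requested performance floor */
	unsigned int uclamp_max; /* requested performance cap */
};

static unsigned int toy_clamp(unsigned int v, unsigned int lo, unsigned int hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/*
 * Max aggregation (simplified): add up raw utilization, then clamp the
 * total once, using the largest uclamp_min and the largest uclamp_max
 * found among the runnable tasks. All other clamp values are ignored.
 */
unsigned int toy_rq_util_max_aggr(const struct toy_task *t, size_t n)
{
	unsigned int sum = 0, rq_min = 0, rq_max = 0;
	size_t i;

	assert(t != NULL || n == 0);
	if (n == 0)
		return 0;

	for (i = 0; i < n; i++) {
		sum += t[i].util;
		if (t[i].uclamp_min > rq_min)
			rq_min = t[i].uclamp_min;
		if (t[i].uclamp_max > rq_max)
			rq_max = t[i].uclamp_max;
	}
	sum = toy_clamp(sum, 0, TOY_CAPACITY_SCALE);
	return toy_clamp(sum, rq_min, rq_max);
}

/*
 * Sum aggregation (simplified): clamp each task's utilization with its
 * own uclamp_min/uclamp_max, then add the clamped values, so every
 * task's hint contributes to the total.
 */
unsigned int toy_rq_util_sum_aggr(const struct toy_task *t, size_t n)
{
	unsigned int sum = 0;
	size_t i;

	assert(t != NULL || n == 0);
	for (i = 0; i < n; i++)
		sum += toy_clamp(t[i].util, t[i].uclamp_min, t[i].uclamp_max);
	return toy_clamp(sum, 0, TOY_CAPACITY_SCALE);
}
```

For three made-up tasks with (util, uclamp_min, uclamp_max) of
(100, 0, 1024), (50, 512, 1024) and (600, 0, 200), the max variant
yields 750, because the third task's cap of 200 is overridden by the
other tasks' cap of 1024, while the sum variant yields
100 + 512 + 200 = 812.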

>> [2]: https://android.googlesource.com/kernel/gs/+/refs/heads/android-gs-raviole-5.10-android12-d1/drivers/soc/google/vh/kernel/sched/fair.c#510
> 
> I think I clarified several times so far that this is not related to uclamp.
> Could you please refrain from referring to it again in the future? This is
> misleading and neither helps your cause nor its cause. The fact that you're
> relating to it makes me very worried as both links demonstrate lack of
> understanding/confusion of what uclamp is supposed to be.

The intention of the code is irrelevant. What I'm talking about is what 
effect the code actually has. The fact that you keep thinking I don't 
understand what the code does even after me explaining "I know what the 
intention of the code is, I'm just talking about the actual effect of 
the code" is an even more worrying sign.

> Again, this solution is not acceptable and you're moving things in the wrong
> direction. We don't want to redesign what uclamp means, but fix some corner
> cases. And you're doing the former not the latter.

I'm saying max aggregation is not effective and proposing a more 
effective implementation. In fact, you have sent a series that removes 
max aggregation. Clearly that does not count as fixing corner cases but 
is actually a redesign, and I don't understand why you are allowed to do 
such things and I am not. Also, when something becomes harder and harder 
to fix, a redesign that solves all the problems is clearly justified.

To summarize sum aggregation:

Pros:
1. A more effective implementation, proven by evaluation numbers
2. Consuming the same or even less power in benchmarks
3. 350 lines of code in total, less than half of max aggregation
4. This series shows sum aggregation in its entirety, and its
effectiveness, today. Max aggregation still needs further filtering and
load balancing patches which we have not seen yet.
5. Resolves the drawbacks from max aggregation (which you might say is 
the same as 4)
6. Significantly reduces uclamp overhead, no bucket operations

Cons:
1. The objection that uclamp values should not be summed (although the
scheduler used to sum up utilization, and util_est sums up a processed
PELT signal today)
2. Under-utilization case (which is a problem GROUP_THROTTLE also has, 
and can be worked around. Please, I know the intention of 
GROUP_THROTTLE, I'm just talking about its actual effects).

I don't see why the things I listed above are in the wrong direction.