linux-kernel - Re: [RFC PATCH 0/7] sched: cpufreq: Remove magic margins

Message-ID: <44fc6d03-c663-53de-e4f7-e56687c5718d@arm.com>
Date:   Fri, 8 Sep 2023 09:40:35 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Qais Yousef <qyousef@...alina.io>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...nel.org>,
        "Rafael J. Wysocki" <rafael@...nel.org>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        linux-kernel@...r.kernel.org, linux-pm@...r.kernel.org,
        Lukasz Luba <lukasz.luba@....com>
Subject: Re: [RFC PATCH 0/7] sched: cpufreq: Remove magic margins

On 08/09/2023 02:17, Qais Yousef wrote:
> On 09/07/23 15:08, Peter Zijlstra wrote:
>> On Mon, Aug 28, 2023 at 12:31:56AM +0100, Qais Yousef wrote:

[...]

> But for the 0.8 and 1.25 margin problems, actually the problem is that 25% is
> too aggressive/fast and wastes power. I'm actually slowing things down as
> a result of this series. And I'm expecting some not to be happy about it on
> their systems. The response_time_ms was my way to give back control. I didn't
> see how I can make things faster and slower at the same time without making
> decisions on behalf of the user/sysadmin.
> 
> So the connection I see between PELT and the margins or headrooms in
> fits_capacity() and map_util_perf()/dvfs_headroom is that they expose the need
> to manage the perf/power trade-off of the system.
> 
> Particularly the default is not good for the modern systems, Cortex-X is too
> powerful but we still operate within the same power and thermal budgets.
> 
> And what was a high end A78 is a mid core today. So if you look at today's
> mobile world topology we really have a tiny+big+huge combination of cores. The
> bigs are called mids, but they're very capable. Fits capacity forces migration
> to the 'huge' cores too soon with that 80% margin. While the 80% might be too
> small for the tiny ones as some workloads really struggle there if they hang on
> for too long. It doesn't help that these systems ship with 4ms tick. Something
> more to consider changing I guess.

If this is the problem then you could simply make the margin (headroom)
a function of cpu_capacity_orig?
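
A minimal sketch of what that could look like, as plain userspace C rather
than a kernel patch. The names fit_threshold() and fits_capacity_scaled() and
the two threshold constants are hypothetical and chosen only for illustration;
nothing here is taken from the series. The idea is to replace the single fixed
80% fit threshold with one interpolated linearly from the CPU's original
capacity, so small CPUs keep more headroom and big CPUs less:

```c
#include <stdbool.h>

#define SCHED_CAPACITY_SCALE 1024UL

/*
 * Hypothetical tunables (assumptions, not from the thread): the fit
 * threshold, in 1/1024 units, at capacity 0 and at capacity 1024.
 */
#define FIT_THRESHOLD_MIN 717UL	/* ~70%: small CPUs give up tasks earlier */
#define FIT_THRESHOLD_MAX 922UL	/* ~90%: big CPUs hold on to tasks longer */

/* Linearly interpolate the fit threshold from the CPU's original capacity. */
static unsigned long fit_threshold(unsigned long cpu_capacity_orig)
{
	if (cpu_capacity_orig > SCHED_CAPACITY_SCALE)
		cpu_capacity_orig = SCHED_CAPACITY_SCALE;

	return FIT_THRESHOLD_MIN +
	       (FIT_THRESHOLD_MAX - FIT_THRESHOLD_MIN) * cpu_capacity_orig /
	       SCHED_CAPACITY_SCALE;
}

/* A task fits if its utilization is below threshold * capacity. */
static bool fits_capacity_scaled(unsigned long util,
				 unsigned long cpu_capacity_orig)
{
	return util * SCHED_CAPACITY_SCALE <
	       cpu_capacity_orig * fit_threshold(cpu_capacity_orig);
}
```

With these illustrative constants, a CPU of capacity 512 ends up with a
threshold of 819/1024 (about 80%), while smaller CPUs get a lower threshold
and bigger ones a higher one. Whether a linear mapping is the right shape is
an open question.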

[...]

> There's a question that I'm struggling with if I may ask. Why is it perceived
> our constant response time (practically ~200ms to go from 0 to max) as a good
> fit for all use cases? Capability of systems differs widely in terms of what
> performance you get at say a util of 512. Or in other words how much work is
> done in a unit of time differs between system, but we still represent that work
> in a constant way. A task ran for 10ms on powerful System A would have done

PELT (util_avg) is uarch & frequency invariant.

So e.g. a task with util_avg = 256 could have a runtime/period

on big CPU (capacity = 1024) of 4ms/16ms

on little CPU (capacity = 512) of 8ms/16ms

The amount of work is invariant (so we can compare between asymmetric
capacity CPUs) but the runtime obviously differs according to the capacity.
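
The arithmetic behind the example above can be sketched as follows. This is a
steady-state simplification that assumes a strictly periodic task and ignores
PELT's geometric decay; the helper names runtime_ms() and work_done() are
hypothetical and used only for illustration:

```c
#define SCHED_CAPACITY_SCALE 1024UL

/*
 * Runtime per period that a periodic task with the given invariant
 * utilization needs on a CPU of the given capacity:
 *   runtime = period * util / capacity
 */
static unsigned long runtime_ms(unsigned long util, unsigned long capacity,
				unsigned long period_ms)
{
	return period_ms * util / capacity;
}

/*
 * Work done per period, normalized to the biggest CPU
 * (capacity == SCHED_CAPACITY_SCALE). This is the quantity that stays
 * the same across CPUs of different capacity.
 */
static unsigned long work_done(unsigned long run_ms, unsigned long capacity)
{
	return run_ms * capacity / SCHED_CAPACITY_SCALE;
}
```

For util_avg = 256 and a 16ms period, runtime_ms() gives 4ms at capacity 1024
and 8ms at capacity 512, and work_done() returns 4 in both cases.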

[...]