Message-ID: <ad78436f-43c3-4b4b-9cb5-28dffd43468a@arm.com>
Date:   Thu, 2 Nov 2023 18:37:33 +0100
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Hongyan Xia <Hongyan.Xia2@....com>, Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Juri Lelli <juri.lelli@...hat.com>
Cc:     Qais Yousef <qyousef@...alina.io>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Lukasz Luba <lukasz.luba@....com>,
        Christian Loehle <christian.loehle@....com>,
        linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH 4/6] sched/fair: Rewrite util_fits_cpu()

On 04/10/2023 11:04, Hongyan Xia wrote:
> From: Hongyan Xia <hongyan.xia2@....com>
> 
> Currently, there's no way to distinguish between 1) a CPU that is
> actually maxed out at its highest frequency and 2) one that is
> throttled because of UCLAMP_MAX, since both present util_avg values of
> 1024. This is problematic because when we pick a CPU for a task to
> run on, we would like to give 2) a chance, or at least prefer 2) to 1).
> 
> Current upstream gives neither a chance because the spare capacity is 0
> for either case. There are patches to fix this problem by considering 0
> capacities [1], but this might still be inefficient because this ends
> up treating 1) and 2) equally, and will always pick the same one because
> we don't change how we iterate through all CPUs. If we end up putting
> many tasks on 1), then this creates a seriously unbalanced load for the
> two CPUs.
> 
> Fix by using util_avg_uclamp for util_fits_cpu(). This way, case 1) will
> still keep its utilization at 1024 whereas 2) shows spare capacities if
> the sum of util_avg_uclamp values is still under the CPU capacity.
> Note that this is roughly what the sum aggregation does in the Android
> kernel [2] (although we clamp UCLAMP_MIN as well in this patch, which
> may need some discussions), which shows superior energy savings because
> there's more chance that a task can get scheduled on 2) instead of
> finding a big CPU to run on.
> 
> Under sum aggregation, checking whether a task fits a CPU becomes much
> simpler. We simply call fits_capacity(), and no code is needed to check
> all the corner cases for uclamp. This means util_fits_cpu() goes back to
> returning true or false instead of a tri-state, simplifying a significant
> amount of code.

You could remove util_fits_cpu() and task_fits_cpu() and call
fits_capacity() directly. We should try to keep the zoo of util-related
functions as small as possible.
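
To illustrate why a plain fits_capacity() call could be enough under sum
aggregation, here is a rough userspace sketch. It is a toy model, not code
from the patch: struct toy_task and the toy_*() helpers are hypothetical
names, only UCLAMP_MAX is modelled (the patch also clamps UCLAMP_MIN), and
the rq-level sum stands in for what the series calls util_avg_uclamp. The
margin in fits_capacity() mirrors the kernel macro of the same name.

```c
#include <stdbool.h>

/* Same margin as the kernel's fits_capacity(): util must stay under ~80% of capacity. */
static bool fits_capacity(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}

/* Hypothetical stand-in for a task: raw utilization plus its UCLAMP_MAX. */
struct toy_task {
	unsigned long util_avg;
	unsigned long uclamp_max;
};

/* Clamped contribution of a single task (UCLAMP_MAX only in this sketch). */
static unsigned long toy_clamped_util(const struct toy_task *t)
{
	return t->util_avg < t->uclamp_max ? t->util_avg : t->uclamp_max;
}

/* Sum aggregation: the rq value is the sum of the clamped contributions. */
static unsigned long toy_rq_util_uclamp(const struct toy_task *tasks, int n)
{
	unsigned long sum = 0;

	for (int i = 0; i < n; i++)
		sum += toy_clamped_util(&tasks[i]);
	return sum;
}

/* The fit check collapses to one boolean fits_capacity() call, no tri-state. */
static bool toy_task_fits(const struct toy_task *rq_tasks, int n,
			  const struct toy_task *p, unsigned long capacity)
{
	unsigned long util = toy_rq_util_uclamp(rq_tasks, n) + toy_clamped_util(p);

	return fits_capacity(util, capacity);
}
```

With one always-running task on each CPU, a CPU whose task has
uclamp_max = 1024 reports 1024 and rejects a new task of util 100 (case 1),
while a CPU whose task is capped at uclamp_max = 200 reports 200 and accepts
it (case 2), both on a capacity-1024 CPU.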

[...]
