

Message-ID: <37be0494-7e38-4275-b6eb-62a2eb2f6d46@arm.com>
Date: Tue, 19 Mar 2024 16:34:16 +0100
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Hongyan Xia <hongyan.xia2@....com>, Ingo Molnar <mingo@...hat.com>,
 Peter Zijlstra <peterz@...radead.org>,
 Vincent Guittot <vincent.guittot@...aro.org>,
 Juri Lelli <juri.lelli@...hat.com>, Steven Rostedt <rostedt@...dmis.org>,
 Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
 Daniel Bristot de Oliveira <bristot@...hat.com>,
 Valentin Schneider <vschneid@...hat.com>
Cc: Qais Yousef <qyousef@...alina.io>,
 Morten Rasmussen <morten.rasmussen@....com>,
 Lukasz Luba <lukasz.luba@....com>,
 Christian Loehle <christian.loehle@....com>, linux-kernel@...r.kernel.org,
 David Dai <davidai@...gle.com>, Saravana Kannan <saravanak@...gle.com>
Subject: Re: [RFC PATCH v2 1/7] Revert "sched/uclamp: Set max_spare_cap_cpu
 even if max_spare_cap is 0"

On 01/02/2024 14:11, Hongyan Xia wrote:
> From: Hongyan Xia <Hongyan.Xia2@....com>
> 
> That commit creates further problems because 0 spare capacity can be
> either a real indication that the CPU is maxed out, or the CPU is
> UCLAMP_MAX throttled, but we end up giving all of them a chance which
> can result in bogus energy calculations. It also tends to schedule
> tasks on the same CPU and requires load balancing patches. Sum
> aggregation solves these problems and this patch is not needed.
> 
> This reverts commit 6b00a40147653c8ea748e8f4396510f252763364.

I assume you did this revert especially for the 'Scenario 5: 8 tasks
with UCLAMP_MAX of 120' testcase?

IMHO, the issue is especially visible in compute_energy()'s busy_time
computation with a valid destination CPU (dst_cpu >= 0), i.e. when we
have to add the performance domain (PD) busy time and the task busy
time.

find_energy_efficient_cpu() (feec())

 for each pd
  for each cpu in pd

   set {prev_,max}_spare_cap

 bail if prev_ and max_spare_cap < 0 (was == 0 before)

 {base_,prev_,cur_}energy = compute_energy
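
To make the changed bail condition concrete, here is a minimal sketch.
It is a simplified model following the summary above, not the kernel
code: the function names are hypothetical, and -1 is assumed to mean
"no candidate CPU found".

```c
#include <stdbool.h>

/*
 * Simplified model of the per-PD bail check in feec().
 * Assumption: a spare capacity of -1 means "no candidate CPU found".
 */

/* With commit 6b00a401: 0 is a valid spare capacity, only < 0 bails. */
static bool skip_pd_mainline(long prev_spare_cap, long max_spare_cap)
{
	return prev_spare_cap < 0 && max_spare_cap < 0;
}

/* With the revert: 0 spare capacity is treated like "nothing found". */
static bool skip_pd_reverted(long prev_spare_cap, long max_spare_cap)
{
	return prev_spare_cap <= 0 && max_spare_cap <= 0;
}
```

In this model a PD whose CPUs all report 0 spare capacity is still
evaluated on mainline, whereas it is skipped with the revert.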

So with the patch we potentially compute energy for a saturated PD as
follows:

 compute_energy()

  if (dst_cpu >= 0)
   busy_time = min(eenv->pd_cap, eenv->busy_time + eenv->task_busy_time)
                   <----(a)--->  <--------------(b)------------------->

  energy = em_cpu_energy(pd->em_pd, max_util, busy_time, eenv->cpu_cap)

If (b) > (a) then we're saturated and 'energy' is bogus.
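
A small standalone sketch of why the clamp hides the saturation. The
struct and helper names are hypothetical stand-ins for the eenv fields
used above, and the numbers in the note below are made up for
illustration (pd_cap = 1784 assumes a PD of 4 CPUs of capacity 446).

```c
#include <stdbool.h>

/* Simplified stand-in for the struct energy_env fields used above. */
struct eenv_sketch {
	unsigned long pd_cap;         /* (a): total capacity of the PD */
	unsigned long busy_time;      /* PD busy time without the task */
	unsigned long task_busy_time; /* busy time added by the task */
};

/* Mirrors: busy_time = min(pd_cap, busy_time + task_busy_time) */
static unsigned long pd_busy_time_with_task(const struct eenv_sketch *e)
{
	unsigned long sum = e->busy_time + e->task_busy_time; /* (b) */

	return sum < e->pd_cap ? sum : e->pd_cap;
}

/* Hypothetical helper: true when (b) > (a), i.e. the clamp dropped demand. */
static bool pd_saturated_with_task(const struct eenv_sketch *e)
{
	return e->busy_time + e->task_busy_time > e->pd_cap;
}
```

With pd_cap = 1784, busy_time = 1700 and task_busy_time = 200, (b) is
1900 but the clamped busy_time is 1784, so 116 units of demand never
reach the energy estimate.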

The way to fix this is up for discussion:

(1) feec() returning prev_cpu
(2) feec() returning -1 (forcing wakeup into sis() -> sic())
(3) using uclamped values for task and rq utilization

None of these has immediately given, on mainline, the desired task
placement that you can achieve with uclamp sum aggregation: 2 tasks on
each of the 4 little CPUs and no task on the 2 big CPUs, on my
[l B B l l l] machine w/ CPU capacities = [446 1024 1024 446 446 446].

[...]