linux-kernel - Re: [PATCH v2 1/3] sched/uclamp: Set max_spare_cap_cpu even if max_spare


Message-ID: <c5722699-d366-3f26-635d-a45f746a3658@arm.com>
Date:   Wed, 7 Jun 2023 15:52:00 +0100
From:   Hongyan Xia <hongyan.xia2@....com>
To:     Qais Yousef <qyousef@...alina.io>,
        Dietmar Eggemann <dietmar.eggemann@....com>
Cc:     Vincent Guittot <vincent.guittot@...aro.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        linux-kernel@...r.kernel.org, Lukasz Luba <lukasz.luba@....com>,
        Wei Wang <wvw@...gle.com>, Xuewen Yan <xuewen.yan94@...il.com>,
        Hank <han.lin@...iatek.com>,
        Jonathan JMChen <Jonathan.JMChen@...iatek.com>
Subject: Re: [PATCH v2 1/3] sched/uclamp: Set max_spare_cap_cpu even if
 max_spare_cap is 0

Hi Qais,

On 2023-02-11 17:50, Qais Yousef wrote:
> [...]
>>
>> So EAS keeps packing on the cheaper PD/clamped OPP.
> 
> Which is the desired behavior for uclamp_max?
> 
> The only issue I see is that we want to distribute within a pd. Which is
> something I was going to work on and send after later - but can lump it in this
> series if it helps.

I more or less share Dietmar's concern, which is that we keep packing
tasks onto the same small CPU when every CPU has a spare cpu_cap of 0.

I wonder if this could be useful: alongside cfs_rq->avg.util_avg,
we have a cfs_rq->avg.util_avg_uclamp_max. It keeps track of
util_avg, but with each task on the rq capped at its uclamp_max value.
So even if there are two always-running tasks with uclamp_max values of
100 and no idle time, the cfs_rq only sees a cpu_util() of 200 and still
has a remaining capacity of 1024 - 200, not 0. This also helps balance
the load when rqs have no idle time. Suppose two CPUs both have no idle
time, but one is running a single task clamped at 100 and the other is
running two such tasks: the first sees a remaining capacity of
1024 - 100 and the second 1024 - 200, so we still prefer the first one.
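
To make the arithmetic above concrete, here is a minimal user-space
sketch, not kernel code. The names toy_task, capped_util_sum() and
spare_cap() are made up for illustration. It assumes a CPU capacity of
1024 and that an always-running task has a raw util of 1024:

```c
#include <stddef.h>

/* Assumption: full CPU capacity on the usual 0..1024 scale. */
#define TOY_CPU_CAPACITY 1024UL

/* Hypothetical stand-in for a task; not a kernel structure. */
struct toy_task {
	unsigned long util;        /* raw utilization, 0..1024 */
	unsigned long uclamp_max;  /* per-task cap, 0..1024 */
};

/* Sum of per-task utilization, each task capped at its uclamp_max. */
static unsigned long capped_util_sum(const struct toy_task *tasks, size_t n)
{
	unsigned long sum = 0;

	for (size_t i = 0; i < n; i++) {
		unsigned long u = tasks[i].util;

		if (u > tasks[i].uclamp_max)
			u = tasks[i].uclamp_max;
		sum += u;
	}
	return sum;
}

/* Remaining capacity as seen through the capped sum, floored at 0. */
static unsigned long spare_cap(const struct toy_task *tasks, size_t n)
{
	unsigned long used = capped_util_sum(tasks, n);

	return used >= TOY_CPU_CAPACITY ? 0 : TOY_CPU_CAPACITY - used;
}
```

With one always-running task clamped at 100, spare_cap() gives
1024 - 100 = 924; with two such tasks it gives 1024 - 200 = 824, so the
CPU running the single task would still be preferred.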

And I wonder if this could also help when calculating energy when
there's no idle time under uclamp_max. Instead of seeing a util_avg of
1024, we would actually see a lower value. This is also what
cpu_util_next() does in Android's sum aggregation, but I'm thinking of
maintaining it right beside util_avg so that we don't have to sum up
everything every time.
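
As a rough illustration of keeping the value next to util_avg rather
than re-summing all tasks on every query, here is a minimal user-space
sketch. It is not kernel code: sketch_rq, sketch_task, sketch_enqueue()
and sketch_dequeue() are hypothetical names, and it ignores the fact
that real PELT signals decay and are updated over time, so it only
shows the enqueue/dequeue bookkeeping:

```c
/* Hypothetical stand-in for a task; not a kernel structure. */
struct sketch_task {
	unsigned long util;        /* raw utilization, 0..1024 */
	unsigned long uclamp_max;  /* per-task cap, 0..1024 */
	unsigned long contrib;     /* capped value added at enqueue */
};

/* Hypothetical stand-in for a runqueue holding the running sum. */
struct sketch_rq {
	unsigned long util_avg_uclamp_max;
};

/* Add the task's capped utilization to the running sum. */
static void sketch_enqueue(struct sketch_rq *rq, struct sketch_task *p)
{
	p->contrib = p->util < p->uclamp_max ? p->util : p->uclamp_max;
	rq->util_avg_uclamp_max += p->contrib;
}

/* Remove exactly what was added at enqueue, guarding against underflow. */
static void sketch_dequeue(struct sketch_rq *rq, struct sketch_task *p)
{
	if (rq->util_avg_uclamp_max >= p->contrib)
		rq->util_avg_uclamp_max -= p->contrib;
	else
		rq->util_avg_uclamp_max = 0;
	p->contrib = 0;
}
```

Reading rq->util_avg_uclamp_max is then a single load, at the cost of
extra work on every enqueue and dequeue.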

Hongyan