Message-ID: <16893160-bf0d-0688-b29e-6bd75be46990@arm.com>
Date: Mon, 5 Jun 2023 14:07:28 +0100
From: Lukasz Luba <lukasz.luba@....com>
To: Qais Yousef <qyousef@...alina.io>
Cc: Vincent Guittot <vincent.guittot@...aro.org>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, Wei Wang <wvw@...gle.com>,
Xuewen Yan <xuewen.yan94@...il.com>,
Hank <han.lin@...iatek.com>,
Jonathan JMChen <Jonathan.JMChen@...iatek.com>,
Dietmar Eggemann <dietmar.eggemann@....com>
Subject: Re: [PATCH v2 1/3] sched/uclamp: Set max_spare_cap_cpu even if
max_spare_cap is 0
On 5/31/23 19:22, Qais Yousef wrote:
> Hi Lukasz!
>
> Sorry for the late response..
>
> On 05/22/23 09:30, Lukasz Luba wrote:
>> Hi Qais,
>>
>> I have a question regarding the 'soft cpu affinity'.
>
> [...]
>
>>> IIUC I'm not seeing this as a problem. The goal of capping with uclamp_max
>>> is twofold:
>>>
>>> 1. Prevent tasks from consuming energy.
>>> 2. Keep them away from expensive CPUs.
>>>
>>> 2 is actually very important for 2 reasons:
>>>
>>> a. Because of max aggregation - any uncapped task that wakes up will
>>> cause a frequency spike on this 'expensive' cpu. We don't have
>>> a mechanism to downmigrate it - which is another thing I'm working
>>> on.
>>> b. It is desired to keep these bigger cpu idle ready for more important
>>> work.
>>>
>>> For 2, generally we don't want these tasks to steal bandwidth from the CPUs
>>> that we'd like to preserve for other types of work.
>>
>> I'm a bit afraid of such a 'strong force'. That means the task would
>> not go via EAS if we set uclamp_max to e.g. 90 while the little capacity
>> is 125. Or am I missing something?
>
> We should go via EAS, actually that's the whole point.
>
> Why do you think we won't go via EAS? The logic is that we give a hint to
> prefer the little core, but we can still pick something else if it's more
> energy efficient.
>
> What uclamp_max enables us to do is still consider that little core even if
> its utilization says the task doesn't fit there. We need to merge these
> patches first though, as it's broken at the moment. If the little capacity is
> 125 and the utilization of the task is 125, then even if uclamp_max is 0, EAS
> will skip the whole little cluster as a potential candidate because there's
> no spare capacity there. Even if the whole little cluster is idle.
OK, I see now - it's a bug then.
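
For reference, a minimal standalone sketch of that bug, reusing the
125/125 numbers from above. This is illustrative userspace C, not the
actual fair.c loop; only the max_spare_cap/max_spare_cap_cpu names
mirror the kernel, the rest is made up for the example:

#include <stdio.h>

struct cpu { unsigned long cap, util; };

int main(void)
{
	/* Two fully idle little CPUs, capacity 125; task util is 125. */
	struct cpu littles[2] = { { 125, 0 }, { 125, 0 } };
	unsigned long task_util = 125;

	unsigned long max_spare_cap = 0;
	int max_spare_cap_cpu = -1;

	for (int i = 0; i < 2; i++) {
		unsigned long util = littles[i].util + task_util;
		unsigned long spare = littles[i].cap > util ?
				      littles[i].cap - util : 0;

		/* spare is 0 here, so "0 > 0" never records a candidate. */
		if (spare > max_spare_cap) {
			max_spare_cap = spare;
			max_spare_cap_cpu = i;
		}
	}

	/* Prints -1: the idle little PD is dropped from the search. */
	printf("max_spare_cap_cpu = %d\n", max_spare_cap_cpu);
	return 0;
}

If I read the patch subject right, the fix is to let a spare capacity
of 0 still record a candidate, e.g. by initializing max_spare_cap
below zero as a signed value.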
>
>>
>> This might effectively use more energy for those tasks which can run on
>> any CPU, where EAS would otherwise figure out a good energy placement. I'm
>> worried about this, since we have L3+littles in one DVFS domain and the L3
>> would only be bigger in future.
>
> It's a bias that will enable the search algorithm in EAS to still consider the
> little core for big tasks. This bias will depend on the uclamp_max value chosen
> by userspace (so they have some control over how hard to cap the task), and
> what
> else is happening in the system at the time it wakes up.
OK, so we would go via EAS and check the energy impact in 3 PDs - which
is desired.
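
To make that bias concrete, a toy example (again, not kernel code;
fits_capacity() here only mimics the ~20% headroom idea of the kernel
macro) of how uclamp_max=90 keeps a 125-util task eligible for a
125-capacity little CPU, so all 3 PDs stay in the energy comparison:

#include <stdio.h>

/* Same ~20% headroom idea as the kernel's fits_capacity() macro. */
static int fits_capacity(unsigned long util, unsigned long cap)
{
	return util * 1280 < cap * 1024;
}

int main(void)
{
	unsigned long little_cap = 125, task_util = 125, uclamp_max = 90;
	unsigned long capped = task_util < uclamp_max ? task_util : uclamp_max;

	/* 125 * 1280 >= 125 * 1024: the raw task does not fit. */
	printf("raw fits: %d\n", fits_capacity(task_util, little_cap));

	/*
	 * 90 * 1280 < 125 * 1024: the capped task fits, so the little
	 * PD stays a candidate and gets compared on energy.
	 */
	printf("capped fits: %d\n", fits_capacity(capped, little_cap));
	return 0;
}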
>
>>
>> IMO, to keep the big CPUs idle more, we should give them a big wake-up
>> energy cost. That's my 3rd feature for the EM, presented at OSPM 2023.
>
> Considering the wake up cost in EAS would be a great addition to have :)
>
>>
>>>
>>> Of course userspace has control by selecting the right uclamp_max value. They
>>> can increase it to allow a spill to the next pd - or keep it low to steer
>>> them more strongly onto a specific pd.
>>
>> This would be interesting to see in practice. I think we need such an
>> experiment for such changes.
>
> I'm not sure what you mean by 'such changes'. I hope you don't mean these
> patches, as they are not the key. They fix an obvious bug where the task
> placement hint won't work at all. They don't modify any behavior that
> shouldn't have already been there, nor do they introduce a new limitation. I
> have to say I am disappointed that these patches aren't considered an
> important fix for an obvious breakage.
I mean, in practice - in our 3-gear Pixel 6 test :)
Thanks for the explanation.
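
PS. Since the userspace control point came up above: for anyone trying
this, a task can set its own uclamp_max via sched_setattr(). A minimal
sketch below; the struct is the uapi layout redeclared by hand (libc
may not ship it), and the value 90 is just the example from this
thread, on the 0..1024 capacity scale:

#include <stdint.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

/* uapi struct sched_attr, redeclared since libc may not provide it. */
struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
	uint32_t sched_util_min;
	uint32_t sched_util_max;
};

#define SCHED_FLAG_KEEP_POLICY    0x08
#define SCHED_FLAG_KEEP_PARAMS    0x10
#define SCHED_FLAG_UTIL_CLAMP_MAX 0x40

int main(void)
{
	struct sched_attr attr = {
		.size		= sizeof(attr),
		/* Keep policy and params, only update the max clamp. */
		.sched_flags	= SCHED_FLAG_KEEP_POLICY |
				  SCHED_FLAG_KEEP_PARAMS |
				  SCHED_FLAG_UTIL_CLAMP_MAX,
		.sched_util_max	= 90,
	};

	/* pid 0 means the calling thread. */
	if (syscall(SYS_sched_setattr, 0, &attr, 0))
		perror("sched_setattr");
	return 0;
}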