lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5CC2DC51.9030908@linaro.org>
Date:   Fri, 26 Apr 2019 06:24:17 -0400
From:   Thara Gopinath <thara.gopinath@...aro.org>
To:     Ionela Voinescu <ionela.voinescu@....com>,
        Quentin Perret <quentin.perret@....com>
Cc:     mingo@...hat.com, peterz@...radead.org, rui.zhang@...el.com,
        linux-kernel@...r.kernel.org, amit.kachhap@...il.com,
        viresh.kumar@...aro.org, javi.merino@...nel.org,
        edubezval@...il.com, daniel.lezcano@...aro.org,
        vincent.guittot@...aro.org, nicolas.dechesne@...aro.org,
        bjorn.andersson@...aro.org, dietmar.eggemann@....com
Subject: Re: [PATCH V3 3/3] thermal/cpu-cooling: Update thermal pressure in
 case of a maximum frequency capping

On 04/24/2019 11:56 AM, Ionela Voinescu wrote:
> Hi guys,
> 
> On 23/04/2019 23:38, Thara Gopinath wrote:
>> On 04/18/2019 05:48 AM, Quentin Perret wrote:
>>> On Tuesday 16 Apr 2019 at 15:38:41 (-0400), Thara Gopinath wrote:
>>>> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
>>>> @@ -177,6 +178,9 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
>>>>  
>>>>  		if (policy->max > clipped_freq)
>>>>  			cpufreq_verify_within_limits(policy, 0, clipped_freq);
>>>> +
>>>> +		sched_update_thermal_pressure(policy->cpus,
>>>> +				policy->max, policy->cpuinfo.max_freq);
>>>
>>> Is this something we could do this CPUFreq ? Directly in
>>> cpufreq_verify_within_limits() perhaps ?
>>>
>>> That would re-define the 'thermal pressure' framework in a more abstract
>>> way and make the scheduler look at 'frequency capping' events,
>>> regardless of the reason for capping.
>>>
>>> That would reflect user-defined frequency constraint into cpu_capacity,
>>> in addition to the thermal stuff. I'm not sure if there is another use
>>> case for frequency capping ?
>> Hi Quentin,
>> Thanks for the review. Sorry for the delay in response as I was on
>> vacation for the past few days.
>> I think there is one major difference between user-defined frequency
>> constraints and frequency constraints due to thermal events in terms of
>> the time period the system spends in the the constraint state.
>> Typically, a user constraint lasts for seconds if not minutes and I
>> think in this case cpu_capacity_orig should reflect this constraint and
>> not cpu_capacity like this patch set. Also, in case of the user
>> constraint, there is possibly no need to accumulate and average the
>> capacity constraints and instantaneous values can be directly applied to
>> cpu_capacity_orig. On the other hand thermal pressure is more spiky and
>> sometimes in the order of ms and us requiring the accumulating and
>> averaging.
> 
> I think we can't make any assumptions in regards to the intentions of
> the user when restricting the OPP range though the cpufreq interface,
> but it would still be nice to do something and reflecting it as thermal
> pressure would be a good start. It might not be due to thermal, but it
> is a capacity restriction that would have the same result. Also, if the
> user has the ability to tune the decay period he has the control over
> the behavior of the signal. Given that currently there isn't a smarter
> mechanism (modifying capacity orig, re-normalising the capacity range)
> for long-term capping, even treating it as short-term capping is a good
> start. But this is a bigger exercise and it needs thorough
> consideration, so it could be skipped, in my opinion, for now.. 
> 
> Also, if we want to stick with the "definition", userspace would still
> be able to reflect thermal pressure though the thermal limits interface
> by setting the cooling device state, which will be reflected in this
> update as well. So userspace would have a mechanism to reflect thermal
> pressure.

Yes, target_state under cooling devices can be set and this will reflect
as thermal pressure.

> 
> One addition.. I like that the thermal pressure framework is not tied to
> cpufreq. There are firmware solutions that do not bother informing
> cpufreq of limits being changed, and therefore all of this could be
> skipped. But any firmware driver could call sched_update_thermal_pressure
> on notifications for limits changing from firmware, which is an
> important feature.

For me, I am open to discussion on the best place to call
sched_update_thermal_pressure from. Seeing the discussion and different
opinions, I am wondering should there be a SoC or platform specific hook
provided for better abstraction.

Regards
Thara

> 
>>>
>>> Perhaps the Intel boost stuff could be factored in there ? That is,
>>> at times when the boost freq is not reachable capacity_of() would appear
>>> smaller ... Unless this wants to be reflected instantaneously ?
>> Again, do you think intel boost is more applicable to be reflected in
>> cpu_capacity_orig and not cpu_capacity?
>>>
>>> Thoughts ?
>>> Quentin
>>>
>>
> 
> The changes here would happen even faster than thermal capping, same as
> other restrictions imposed by firmware, so it would not seem right to me
> to reflect it in capacity_orig. Reflecting it as thermal pressure is
> another matter, which I'd say it should be up to the client. The big
> disadvantage I'd see for this is coping with decisions made while being
> capped, when you're not capped any longer, and the other way around. I
> believe these changes would happen too often and they will not happen in
> a ramp-up/ramp-down behavior that we expect from thermal mitigation.
> That's why I believe averaging/regulation of the signal works well in
> this case, and it might not for power related fast restrictions.
> 
> But given these three cases above, it might be that the ideal solution
> is for this framework to be made more generic and for each client to be
> able to obtain and configure a pressure signal to be reflected
> separately in the capacity of each CPU.
> 
> My two pennies' worth,
> Ionela.
> 
> 
> 


-- 
Regards
Thara

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ