Date:   Wed, 16 Jun 2021 13:47:38 +0300
From:   Dmitry Osipenko <digetx@...il.com>
To:     Thara Gopinath <thara.gopinath@...aro.org>,
        Daniel Lezcano <daniel.lezcano@...aro.org>,
        Viresh Kumar <viresh.kumar@...aro.org>
Cc:     Thierry Reding <thierry.reding@...il.com>,
        Jonathan Hunter <jonathanh@...dia.com>,
        Zhang Rui <rui.zhang@...el.com>,
        Amit Kucheria <amitk@...nel.org>,
        Andreas Westman Dorcsak <hedmoo@...oo.com>,
        Maxim Schwalm <maxim.schwalm@...il.com>,
        Svyatoslav Ryhel <clamor95@...il.com>,
        Ihor Didenko <tailormoon@...bler.ru>,
        Ion Agorria <ion@...rria.com>,
        Matt Merhar <mattmerhar@...tonmail.com>,
        Peter Geis <pgwipeout@...il.com>, devicetree@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-tegra@...r.kernel.org,
        linux-pm@...r.kernel.org
Subject: Re: [PATCH v3 4/7] thermal/drivers/tegra: Add driver for Tegra30
 thermal sensor

16.06.2021 05:50, Thara Gopinath wrote:
...
> 
> Hi,
> 
> Thermal pressure is letting scheduler know that the max capacity
> available for a cpu to schedule tasks is reduced due to a thermal event.
> So you cannot have a h/w thermal pressure and s/w thermal pressure.
> There is eventually only one capping applied at h/w level and the
> frequency corresponding to this capping should be used for thermal
> pressure.
> 
> Ideally you should not have both s/w and h/w trying to throttle at
> the same time. Why is this a scenario, and what prevents you from
> disabling s/w throttling when h/w throttling is enabled? Now if there
> has to be an aggregation for whatever reason, this should be done at the
> thermal driver level and passed to the scheduler.

Hello,

The h/w mitigation is much more reactive than software; at the same time,
it's much less flexible than software. It should provide additional
protection in cases where software isn't doing a good job. Ideally, h/w
mitigation should stay inactive all the time; nevertheless, it should be
modeled properly by the driver.

>>>
>>> That is a good question. IMO, first step would be to call
>>> cpufreq_update_limits().
>>
>> Right
>>
>>> [ Cc Thara who implemented the thermal pressure ]
>>>
>>> Maybe Thara has an idea about how to aggregate both? There is another
>>> series floating around with a hardware limiter [1] and the same
>>> problem.
>>>
>>>   [1] https://lkml.org/lkml/2021/6/8/1791
>>
>> Thanks, it indeed looks similar.
>>
>> I guess the common thermal pressure update code could be moved out into
>> a new special cpufreq thermal QoS handler (policy->thermal_constraints),
>> where handler will select the frequency constraint and set up the
>> pressure accordingly. So there won't be any races in the code.
>>
> It was a conscious decision to keep the thermal pressure update out of
> the qos max freq update because there are platforms that don't use the
> qos framework. For example, ACPI uses cpufreq_update_policy.
> But you are right. We have two platforms now applying h/w throttling and
> cpufreq_cooling applying s/w throttling. So it does make sense to have
> one api doing all the computation to update thermal pressure. I am not
> sure how exactly/where exactly this will reside.

The generic cpufreq_cooling already uses QoS for limiting the CPU
frequency. It could be okay to use QoS for the OF drivers, but this needs
a closer look.

We have the case where CPU frequency is changed by the thermal event and
the thermal pressure equation is the same for both s/w cpufreq_cooling
and h/w thermal driver. The pressure is calculated based on the QoS
cpufreq constraint that is already aggregated.
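
As an illustration of that shared equation, here is a minimal user-space
sketch, not kernel code. It assumes the pressure is the capacity lost to
the cap, scaled linearly with frequency, and that competing maximum
frequency constraints aggregate to the lowest one. The struct and
function names (cpu_model, aggregate_max_freq, thermal_pressure) are
hypothetical; the fields only stand in for the values the kernel reads
from arch_scale_cpu_capacity() and the policy's maximum frequency.

```c
#include <assert.h>

/* Hypothetical user-space model of one CPU; not a kernel structure. */
struct cpu_model {
	unsigned long max_capacity;  /* stands in for arch_scale_cpu_capacity() */
	unsigned long max_freq_khz;  /* stands in for the policy's max frequency */
};

/* Assumption: the lowest requested maximum frequency wins. */
static unsigned long aggregate_max_freq(unsigned long sw_cap_khz,
					unsigned long hw_cap_khz)
{
	return sw_cap_khz < hw_cap_khz ? sw_cap_khz : hw_cap_khz;
}

/* Pressure = capacity that is unavailable while the cap is in effect. */
static unsigned long thermal_pressure(const struct cpu_model *cpu,
				      unsigned long capped_freq_khz)
{
	unsigned long long capacity;

	assert(cpu->max_freq_khz > 0);

	/* A cap above the hardware maximum costs no capacity. */
	if (capped_freq_khz > cpu->max_freq_khz)
		capped_freq_khz = cpu->max_freq_khz;

	/* Widen before multiplying so the product cannot overflow. */
	capacity = (unsigned long long)capped_freq_khz * cpu->max_capacity;
	capacity /= cpu->max_freq_khz;

	return cpu->max_capacity - (unsigned long)capacity;
}
```

With a capacity of 1024 and a 1300000 kHz maximum, a s/w cap of 1000000
kHz and a h/w cap of 650000 kHz aggregate to 650000 kHz, which gives a
pressure of 512 in this model.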

Hence what we may need to do on the thermal event is:

1. Update the QoS request
2. Update the thermal pressure
3. Ensure that updates are not racing
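
The three steps above can be sketched in user space roughly as follows.
This is only a model under stated assumptions: a pthread mutex stands in
for whatever locking the kernel code would use, the per-source caps
stand in for QoS requests, and all names (thermal_state, cap_source,
thermal_event) are hypothetical rather than existing kernel API.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical model state; the lock serializes s/w and h/w updates. */
struct thermal_state {
	pthread_mutex_t lock;
	unsigned long max_capacity;  /* full CPU capacity */
	unsigned long max_freq_khz;  /* unthrottled maximum frequency */
	unsigned long sw_cap_khz;    /* last cap requested by s/w cooling */
	unsigned long hw_cap_khz;    /* last cap reported by the h/w limiter */
	unsigned long pressure;      /* last published thermal pressure */
};

enum cap_source { CAP_SW, CAP_HW };

static void thermal_event(struct thermal_state *ts, enum cap_source src,
			  unsigned long cap_khz)
{
	unsigned long long capacity;
	unsigned long eff;

	assert(ts->max_freq_khz > 0);

	/* 3. Hold the lock across both updates so they cannot race. */
	pthread_mutex_lock(&ts->lock);

	/* 1. Update the request of the source that raised the event. */
	if (src == CAP_SW)
		ts->sw_cap_khz = cap_khz;
	else
		ts->hw_cap_khz = cap_khz;

	/* Aggregate: the lowest cap wins, bounded by the real maximum. */
	eff = ts->sw_cap_khz < ts->hw_cap_khz ? ts->sw_cap_khz : ts->hw_cap_khz;
	if (eff > ts->max_freq_khz)
		eff = ts->max_freq_khz;

	/* 2. Update the pressure from the aggregated constraint. */
	capacity = (unsigned long long)eff * ts->max_capacity;
	capacity /= ts->max_freq_khz;
	ts->pressure = ts->max_capacity - (unsigned long)capacity;

	pthread_mutex_unlock(&ts->lock);
}
```

Because the pressure is always recomputed from the aggregated cap while
the lock is held, lifting the s/w cap while the h/w limiter is still
active leaves the pressure at the h/w level instead of dropping it to
zero.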

> So for starters, I think you should replicate the update of thermal
> pressure in your h/w driver when you know that h/w is throttling or
> has throttled the frequency. You can refer to cpufreq_cooling.c
> to see how it is done.
> 
> Moving to a common api can be done as a separate patch series.
> 

Thank you for the clarification and suggestion.
