Message-ID: <20180306231906.GB28911@amd>
Date:   Wed, 7 Mar 2018 00:19:06 +0100
From:   Pavel Machek <pavel@....cz>
To:     Daniel Lezcano <daniel.lezcano@...aro.org>
Cc:     edubezval@...il.com, kevin.wangtao@...aro.org, leo.yan@...aro.org,
        vincent.guittot@...aro.org, amit.kachhap@...il.com,
        linux-kernel@...r.kernel.org, javi.merino@...nel.org,
        rui.zhang@...el.com, daniel.thompson@...aro.org,
        linux-pm@...r.kernel.org, Jonathan Corbet <corbet@....net>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>
Subject: Re: [PATCH V2 5/7] thermal/drivers/cpu_cooling: Add idle cooling
 device documentation

Hi!

> --- /dev/null
> +++ b/Documentation/thermal/cpu-idle-cooling.txt
> @@ -0,0 +1,165 @@
> +
> +Situation:
> +----------
> +

Can we have some real header here? Also if this is .rst, maybe it
should be marked so?

> +Under certain circumstances, the SoC reaches a temperature exceeding
> +the allocated power budget or the maximum temperature limit. The

I don't understand. Power budget is in W, temperature is in
kelvin. Temperature can't exceed power budget AFAICT.

> +former must be mitigated to stabilize the SoC temperature around the
> +temperature control using the defined cooling devices, the latter

later?

> +catastrophic situation where a radical decision must be taken to
> +reduce the temperature under the critical threshold, that can impact
> +the performances.

performance.

> +Another situation is when the silicon reaches a certain temperature
> +which continues to increase even if the dynamic leakage is reduced to
> +its minimum by clock gating the component. The runaway phenomena will
> +continue with the static leakage and only powering down the component,
> +thus dropping the dynamic and static leakage will allow the component
> +to cool down. This situation is critical.

Critical here, critical there. I have trouble following
it. Theoretically hardware should protect itself, because you don't
want a kernel bug to damage your CPU?

> +Last but not least, the system can ask for a specific power budget but
> +because of the OPP density, we can only choose an OPP with a power
> +budget lower than the requested one and underuse the CPU, thus losing
> +performances. In other words, one OPP under uses the CPU with a

performance.

> +lesser than the power budget and the next OPP exceed the power budget,
> +an intermediate OPP could have been used if it were present.

was.
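
To illustrate the OPP gap described in the quoted text: a minimal sketch,
assuming a made-up OPP table. The frequencies, power values and function
names below are hypothetical and come from neither the patch nor the
kernel. It picks the highest OPP that fits a requested power budget and
reports how much of that budget is left unused.

```python
# Hypothetical OPP table: (frequency in MHz, power in mW). Values are made up.
OPPS = [(500, 200), (1000, 500), (1500, 1100), (2000, 2000)]


def pick_opp(budget_mw, opps=OPPS):
    """Return the highest-power OPP that fits within budget_mw, or None."""
    fitting = [opp for opp in opps if opp[1] <= budget_mw]
    if not fitting:
        return None
    return max(fitting, key=lambda opp: opp[1])


def unused_budget(budget_mw, opps=OPPS):
    """Return the part of the budget (in mW) the chosen OPP leaves unused."""
    opp = pick_opp(budget_mw, opps)
    if opp is None:
        return None
    return budget_mw - opp[1]
```

With this table, a budget of 800 mW selects the 1000 MHz OPP and leaves
300 mW unused, because the next OPP up (1100 mW) exceeds the budget.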

> +Solutions:
> +----------
> +
> +If we can remove the static and the dynamic leakage for a specific
> +duration in a controlled period, the SoC temperature will
> +decrease. Acting at the idle state duration or the idle cycle

"should" decrease? If you are in a bad environment...

> +The Operating Performance Point (OPP) density has a great influence on
> +the control precision of cpufreq, however different vendors have a
> +plethora of OPP density, and some have large power gap between OPPs,
> +that will result in loss of performance during thermal control and
> +loss of power in other scenes.

"scene" seems to be the wrong word here.

> +At a specific OPP, we can assume injecting idle cycle on all CPUs,

Extra comma?

> +Idle Injection:
> +---------------
> +
> +The base concept of the idle injection is to force the CPU to go to an
> +idle state for a specified time each control cycle, it provides
> +another way to control CPU power and heat in addition to
> +cpufreq. Ideally, if all CPUs of a cluster inject idle synchronously,
> +this cluster can get into the deepest idle state and achieve minimum
> +power consumption, but that will also increase system response latency
> +if we inject less than cpuidle latency.

I don't understand the last sentence.

> +The mitigation begins with a maximum period value which decrease

decreases?
  
> +more cooling effect is requested. When the period duration is equal to
> +the idle duration, then we are in a situation the platform can’t
> +dissipate the heat enough and the mitigation fails. In this case

fast enough?

> +situation is considered critical and there is nothing to do. The idle

Nothing to do? Maybe power the system down?
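
The relation between the period and the idle duration in the quoted text
can be sketched as a simple ratio. This is an illustration only: the
function name and the microsecond units are assumptions, not taken from
the patch.

```python
def idle_ratio(idle_duration_us, period_us):
    """Fraction of each period spent in injected idle.

    Hypothetical helper: with a fixed idle duration, a shorter period
    means a larger idle fraction, i.e. more cooling effect.
    """
    if idle_duration_us <= 0:
        raise ValueError("idle duration must be positive")
    if period_us < idle_duration_us:
        # A period shorter than the idle slot it contains is not meaningful.
        raise ValueError("period must not be shorter than the idle duration")
    return idle_duration_us / period_us
```

For a 10000 us idle duration, a 40000 us period gives a ratio of 0.25.
Shrinking the period to 10000 us gives 1.0: the CPU is idle all the time,
so no further mitigation is possible by this means.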

> +The idle injection duration value must comply with the constraints:
> +
> +- It is lesser or equal to the latency we tolerate when the mitigation

less ... than the latency

> +Minimum period
> +--------------
> +
> +The idle injection duration being fixed, it is obvious the minimum
> +period can’t be lesser than that, otherwise we will be scheduling the

less.

> +Practically, if the running power is lesses than the targeted power,

less.

> +However, in this demonstration we ignore three aspects:
> +
> + * The static leakage is not defined here, we can introduce it in the
> +   equation but assuming it will be zero most of the time as it is

, but?

Best regards,
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
