Message-ID: <b777c6da-0fe2-96d2-240d-96b065a3f18d@arm.com>
Date:   Wed, 24 Apr 2019 16:57:14 +0100
From:   Ionela Voinescu <ionela.voinescu@....com>
To:     Thara Gopinath <thara.gopinath@...aro.org>, mingo@...hat.com,
        peterz@...radead.org, rui.zhang@...el.com
Cc:     linux-kernel@...r.kernel.org, amit.kachhap@...il.com,
        viresh.kumar@...aro.org, javi.merino@...nel.org,
        edubezval@...il.com, daniel.lezcano@...aro.org,
        vincent.guittot@...aro.org, nicolas.dechesne@...aro.org,
        bjorn.andersson@...aro.org, dietmar.eggemann@....com
Subject: Re: [PATCH V2 0/3] Introduce Thermal Pressure

Hi Thara,

The idea and the results look promising. I'd like to better understand
the cause of the improvements, so I've added some questions below.


> Regarding testing, basic build, boot and sanity testing have been
> performed on hikey960 mainline kernel with debian file system.
> Further, aobench (An occlusion renderer for benchmarking realworld
> floating point performance), dhrystone and hackbench tests have been
> run with the thermal pressure algorithm. During testing, due to
> constraints of step wise governor in dealing with big little systems,
> cpu cooling was disabled on little core, the idea being that
> big core will heat up and cpu cooling device will throttle the
> frequency of the big cores thereby limiting the maximum available
> capacity and the scheduler will spread out tasks to little cores as well.
> Finally, this patch series has been boot tested on db410C running v5.1-rc4
> kernel.
>

Did you try using IPA as well? It is better equipped to deal with
big.LITTLE systems, and it is more likely that IPA will be used on these
systems, where your solution will have the biggest impact as well.
The difference will be that you'll have both the big cluster and the
LITTLE cluster capped in different proportions depending on their
utilization and their efficiency.

> During the course of development various methods of capturing
> and reflecting thermal pressure were implemented.
> 
> The first method to be evaluated was to convert the
> capped max frequency into capacity and have the scheduler use the
> instantaneous value when updating cpu_capacity.
> This method is referenced as "Instantaneous Thermal Pressure" in the
> test results below. 
> 
> The next two methods employ different ways of averaging the
> thermal pressure before applying it when updating cpu_capacity.
> The first of these methods re-used the PELT algorithm already present
> in the kernel that does the averaging of rt and dl load and utilization.
> This method is referenced as "Thermal Pressure Averaging using PELT fmwk"
> in the test results below.
> 
> The final method employs an averaging algorithm that collects and
> decays thermal pressure based on the decay period. In this method,
> the decay period is configurable. This method is referenced as
> "Thermal Pressure Averaging non-PELT Algo. Decay : XXX ms" in the
> test results below.
> 
> The test results below show a 3-5% improvement in performance when
> using the third solution compared to the default system today where
> scheduler is unaware of cpu capacity limitations due to thermal events.
> 
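
To make the first and third methods described above concrete, here is a
minimal sketch. It assumes capacity scales linearly with the capped
frequency and uses a simple exponential decay; the function names, the
frequencies and the decay formula are illustrative assumptions, not the
code or the algorithm from the patch series.

```python
SCHED_CAPACITY_SCALE = 1024  # capacity of the biggest CPU at its top frequency


def capped_capacity(max_capacity, capped_freq_khz, max_freq_khz):
    """Capacity still available under a frequency cap (linear scaling assumed)."""
    return max_capacity * capped_freq_khz // max_freq_khz


def thermal_pressure(max_capacity, capped_freq_khz, max_freq_khz):
    """Capacity lost to the cap: the 'instantaneous' thermal pressure."""
    return max_capacity - capped_capacity(max_capacity, capped_freq_khz, max_freq_khz)


def decayed_average(samples, period_ms, decay_ms):
    """Average pressure samples taken every period_ms, halving the weight of
    history every decay_ms (an illustrative choice, not the patch's algorithm)."""
    keep = 0.5 ** (period_ms / decay_ms)  # fraction of history kept per sample
    avg = 0.0
    for sample in samples:
        avg = avg * keep + sample * (1.0 - keep)
    return avg
```

With a hypothetical CPU whose maximum frequency of 2.4 GHz is capped to
1.2 GHz, thermal_pressure(SCHED_CAPACITY_SCALE, 1200000, 2400000) gives
512. A larger decay_ms makes the averaged value follow changes in capping
more slowly.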

Did you happen to record the amount of capping imposed on the big cores
when these results were obtained? Did you find scenarios where the
capacity of the bigs ended up lower than the capacity of the LITTLEs
(capacity inversion)? This is one case where we'll see a big impact
from considering thermal pressure.
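
As a rough illustration of the capacity inversion check, assuming thermal
pressure is expressed in the same units as CPU capacity (the function
name and the capacity values in the usage note are hypothetical):

```python
def is_capacity_inverted(big_max_capacity, big_pressure,
                         little_max_capacity, little_pressure=0):
    """True when the capped big CPU has less capacity left than the LITTLE CPU."""
    big_available = big_max_capacity - big_pressure
    little_available = little_max_capacity - little_pressure
    return big_available < little_available
```

For example, with a big CPU of capacity 1024 and a LITTLE CPU of capacity
462, a thermal pressure of 600 on the big CPU leaves it with 424, which is
below the uncapped LITTLE, so is_capacity_inverted(1024, 600, 462) is true.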

Also, given that these are more or less sustained workloads, I'm
wondering if there is any effect on workloads running on an uncapped
system following capping. I would imagine such a test being composed of a
single threaded period (no capping) followed by a multi-threaded period
(with capping), continued in a loop. It might be interesting to have
something like this as well, as part of your test coverage.
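
A minimal sketch of such a test, assuming a plain busy loop is enough to
heat up the cores and trigger capping; the phase lengths, the worker
count and all function names are hypothetical, and a real test would
more likely reuse an existing benchmark for each phase.

```python
import multiprocessing as mp
import time


def busy_loop(duration_s):
    """Spin on floating-point work for roughly duration_s seconds."""
    deadline = time.monotonic() + duration_s
    iterations = 0
    x = 1.0001
    while time.monotonic() < deadline:
        x = (x * 1.0001) % 10.0
        iterations += 1
    return iterations


def phase_plan(loops, single_s, multi_s, n_workers):
    """Alternate a single-threaded phase with a multi-threaded one."""
    plan = []
    for _ in range(loops):
        plan.append((1, single_s))         # expected to run uncapped
        plan.append((n_workers, multi_s))  # expected to trigger capping
    return plan


def run_plan(plan):
    """Run each phase with one busy process per worker and report its duration."""
    for workers, duration_s in plan:
        procs = [mp.Process(target=busy_loop, args=(duration_s,))
                 for _ in range(workers)]
        start = time.monotonic()
        for proc in procs:
            proc.start()
        for proc in procs:
            proc.join()
        print(f"{workers} worker(s): {time.monotonic() - start:.2f} s")
```

Calling run_plan(phase_plan(3, 10.0, 30.0, mp.cpu_count())) would run
three such loops. Recording the work completed in each single-threaded
phase, rather than only its duration, would show whether capping from
the preceding multi-threaded phase still affects it.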


Thanks,
Ionela.
