Message-ID: <YzWuq5ShtJC6KWqe@e126311.manchester.arm.com>
Date: Thu, 29 Sep 2022 15:41:47 +0100
From: Kajetan Puchalski <kajetan.puchalski@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Jian-Min Liu <jian-min.liu@...iatek.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Ingo Molnar <mingo@...nel.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Morten Rasmussen <morten.rasmussen@....com>,
Vincent Donnefort <vdonnefort@...gle.com>,
Quentin Perret <qperret@...gle.com>,
Patrick Bellasi <patrick.bellasi@...bug.net>,
Abhijeet Dharmapurikar <adharmap@...cinc.com>,
Qais Yousef <qais.yousef@....com>,
linux-kernel@...r.kernel.org,
Jonathan JMChen <jonathan.jmchen@...iatek.com>
Subject: Re: [RFC PATCH 0/1] sched/pelt: Change PELT halflife at runtime
On Thu, Sep 29, 2022 at 01:21:45PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 29, 2022 at 12:10:17PM +0100, Kajetan Puchalski wrote:
>
> > Overall, the problem being solved here is that based on our testing the
> > PELT half life can occasionally be too slow to keep up in scenarios
> > where many frames need to be rendered quickly, especially on high-refresh
> > rate phones and similar devices.
>
> But it is a problem of DVFS not ramping up quick enough; or of the
> load-balancer not reacting to the increase in load, or what aspect
> controlled by PELT is responsible for the improvement seen?
Based on all the tests we've seen, Jankbench or otherwise, the
improvement can mainly be attributed to the faster frequency ramp-up
that the shorter PELT window produces under schedutil. The faster-rising
signals also mean that tasks get migrated to bigger CPUs sooner on
big.LITTLE systems, which helps too, but it's mostly the frequency
aspect.
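
As a rough illustration of why a shorter half-life ramps up faster, here
is a minimal sketch of the PELT geometric series in a simplified
continuous form. It assumes that pelt_1 means the default 32 ms
half-life and pelt_4 means 8 ms (a reading of the multiplier naming, not
something stated in this message). The names util_after and ms_to_reach
are made up for the example. The real kernel accumulates in 1024 us
periods with fixed-point arithmetic, and schedutil adds its own headroom
and rate limiting, none of which is modelled here.

```python
import math

PELT_MAX_UTIL = 1024  # same scale as SCHED_CAPACITY_SCALE in the kernel


def util_after(run_ms: float, halflife_ms: float) -> float:
    """Utilization of a task that starts at 0 and then runs continuously.

    Simplified continuous model: the gap to the maximum halves every
    `halflife_ms` of running time.
    """
    return PELT_MAX_UTIL * (1.0 - 0.5 ** (run_ms / halflife_ms))


def ms_to_reach(fraction: float, halflife_ms: float) -> float:
    """Continuous running time needed to reach `fraction` of max utilization."""
    return halflife_ms * math.log2(1.0 / (1.0 - fraction))


# Half-life values are assumptions: 32 ms is the mainline default,
# 8 ms is the assumed meaning of "pelt_4".
for name, halflife in (("pelt_1", 32.0), ("pelt_4", 8.0)):
    print(f"{name}: util after 16 ms = {util_after(16.0, halflife):.0f}, "
          f"80% reached after {ms_to_reach(0.8, halflife):.1f} ms")
```

In this model the time to reach any given utilization scales linearly
with the half-life, so a 4x shorter half-life gets schedutil to a given
frequency request in a quarter of the time.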
To establish that this benchmark is sensitive to frequency I ran some
tests using the 'performance' cpufreq governor.
Max frame duration (ms)
+------------------+-------------+----------+
| kernel           |   iteration |    value |
|------------------+-------------+----------|
| pelt_1           |          10 | 157.426  |
| pelt_4           |          10 |  85.2713 |
| performance      |          10 |  40.9308 |
+------------------+-------------+----------+
Mean frame duration (ms)
+---------------+------------------+---------+-------------+
| variable      | kernel           |   value | perc_diff   |
|---------------+------------------+---------+-------------|
| mean_duration | pelt_1           |    14.6 | 0.0%        |
| mean_duration | pelt_4           |    14.5 | -0.58%      |
| mean_duration | performance      |     4.4 | -69.75%     |
+---------------+------------------+---------+-------------+
Jank percentage
+------------+------------------+---------+-------------+
| variable   | kernel           |   value | perc_diff   |
|------------+------------------+---------+-------------|
| jank_perc  | pelt_1           |     2.1 | 0.0%        |
| jank_perc  | pelt_4           |     2   | -3.46%      |
| jank_perc  | performance      |     0.1 | -97.25%     |
+------------+------------------+---------+-------------+
As you can see, bumping up the frequency can hugely improve the results
here. The same thing happens when we decrease the PELT window, just on a
much smaller scale. It also explains specifically where the increased
power usage is coming from.
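
The max frame duration table has no perc_diff column. A minimal sketch
of how it could be derived, assuming perc_diff is the percentage change
relative to the pelt_1 baseline (the definition is inferred from the
other two tables, and the function name is made up for the example):

```python
def perc_diff(value: float, baseline: float) -> float:
    """Percentage change of `value` relative to `baseline` (assumed definition)."""
    return (value - baseline) / baseline * 100.0


# Max frame duration (ms), copied from the first table above.
max_frame_ms = {"pelt_1": 157.426, "pelt_4": 85.2713, "performance": 40.9308}

baseline = max_frame_ms["pelt_1"]
for kernel, value in max_frame_ms.items():
    print(f"{kernel:<12} {value:>9.4f} ms  {perc_diff(value, baseline):+7.2f}%")
```

With these inputs pelt_4 comes out at roughly -46% and performance at
roughly -74%. Applying the same function to the rounded means (14.5
against 14.6) gives about -0.68% rather than the listed -0.58%, so the
perc_diff columns were presumably computed from unrounded values.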