Message-ID: <CAGXk5yo4YOBQkt3DmWkipJgWqU5+00Ahsw_BaFJnwigR2iRmgA@mail.gmail.com>
Date: Wed, 5 Oct 2022 09:57:17 -0700
From: Wei Wang <wvw@...gle.com>
To: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Kajetan Puchalski <kajetan.puchalski@....com>,
Peter Zijlstra <peterz@...radead.org>,
Jian-Min Liu <jian-min.liu@...iatek.com>,
Ingo Molnar <mingo@...nel.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Morten Rasmussen <morten.rasmussen@....com>,
Vincent Donnefort <vdonnefort@...gle.com>,
Quentin Perret <qperret@...gle.com>,
Patrick Bellasi <patrick.bellasi@...bug.net>,
Abhijeet Dharmapurikar <adharmap@...cinc.com>,
Qais Yousef <qais.yousef@....com>,
linux-kernel@...r.kernel.org,
Jonathan JMChen <jonathan.jmchen@...iatek.com>,
"Chung-Kai (Michael) Mei" <chungkai@...gle.com>
Subject: Re: [RFC PATCH 0/1] sched/pelt: Change PELT halflife at runtime
On Tue, Oct 4, 2022 at 2:33 AM Dietmar Eggemann
<dietmar.eggemann@....com> wrote:
>
> Hi Wei,
>
> On 04/10/2022 00:57, Wei Wang wrote:
>
> Please don't do top-posting.
>
Sorry, forgot this was posted to the list...
> > We have some data on an earlier build of Pixel 6a, which also runs a
> > slightly modified "sched" governor. The tuning definitely has both
> > performance and power impact on UX. With some additional user space
> > hints such as ADPF (Android Dynamic Performance Framework) and/or the
> > old-fashioned INTERACTION power hint, different trade-offs can be
> > achieved with this sort of tuning.
> >
> >
> > +---------------------------------------------------------+----------+----------+
> > | Metrics                                                 |     32ms |      8ms |
> > +---------------------------------------------------------+----------+----------+
> > | Sum of gfxinfo_com.android.test.uibench_deadline_missed |   185.00 |   112.00 |
> > | Sum of SFSTATS_GLOBAL_MISSEDFRAMES                      |    62.00 |    49.00 |
> > | CPU Power                                               | 6,204.00 | 7,040.00 |
> > | Sum of Gfxinfo.frame.95th                               |   582.00 |   506.00 |
> > | Avg of Gfxinfo.frame.95th                               |    18.19 |    15.81 |
> > +---------------------------------------------------------+----------+----------+
>
> Which App is package `gfxinfo_com.android.test`? Is this UIBench? Never
> ran it.
>
Yes.
> I'm familiar with `dumpsys gfxinfo <PACKAGE_NAME>`.
>
> # adb shell dumpsys gfxinfo <PACKAGE_NAME>
>
> ...
> ** Graphics info for pid XXXX [<PACKAGE_NAME>] **
> ...
> 95th percentile: XXms <-- (a)
> ...
> Number Frame deadline missed: XX <-- (b)
> ...
>
>
> I assume that `Gfxinfo.frame.95th` is related to (a) and
> `gfxinfo_com.android.test.uibench_deadline_missed` to (b)? Not sure
> where `SFSTATS_GLOBAL_MISSEDFRAMES` is coming from?
>
a) is correct; b) is from surfaceflinger. The Android display pipeline
involves both a) the app (generation) and b) surfaceflinger
(presentation).
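For anyone who wants to collect the two app-side fields by script, here is
a minimal sketch that pulls them out of saved `dumpsys gfxinfo` text. It
assumes the output contains lines shaped like the excerpt quoted above; the
sample text, its numbers, the package name in it, and the helper name
parse_gfxinfo are made up for illustration and are not taken from our runs.

```python
import re

# Hypothetical sample in the shape of the excerpt quoted above.
# The pid, package name and numbers are invented for illustration.
SAMPLE = """\
** Graphics info for pid 1234 [com.android.test.uibench] **
95th percentile: 18ms
Number Frame deadline missed: 7
"""


def parse_gfxinfo(text):
    """Return (95th percentile in ms, deadline-missed count).

    Either element is None when the matching line is absent.
    """
    p95 = re.search(r"95th percentile:\s*(\d+)ms", text)
    missed = re.search(r"Number Frame deadline missed:\s*(\d+)", text)
    return (
        int(p95.group(1)) if p95 else None,
        int(missed.group(1)) if missed else None,
    )


print(parse_gfxinfo(SAMPLE))
```

The surfaceflinger-side counter is not covered by this sketch, since its
output format is not shown in this thread.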
> What's the Sum here? Is it that you ran the test 32 times (582/18.19 = 32)?
>
Uibench[1] has several micro tests and it is the sum of those tests.
[1]: https://cs.android.com/android/platform/superproject/+/master:platform_testing/tests/microbenchmarks/uibench/src/com/android/uibench/microbenchmark/
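To make the trade-off in the table easier to read, the sketch below turns
the 32ms and 8ms columns into relative changes and checks the Sum/Avg
ratio. The numbers are copied from the table quoted above; the dictionary
layout and the helper name pct_change are just for illustration.

```python
# Values copied from the 32ms / 8ms table quoted above.
RESULTS = {
    "uibench_deadline_missed (sum)": (185.0, 112.0),
    "SFSTATS_GLOBAL_MISSEDFRAMES (sum)": (62.0, 49.0),
    "CPU Power": (6204.0, 7040.0),
    "Gfxinfo.frame.95th (sum)": (582.0, 506.0),
    "Gfxinfo.frame.95th (avg)": (18.19, 15.81),
}


def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


for name, (v32, v8) in RESULTS.items():
    print(f"{name:36s} {pct_change(v32, v8):+7.2f}%")

# Sum / Avg gives the number of values that were summed, per column.
count_32 = 582.0 / 18.19
count_8 = 506.0 / 15.81
print(round(count_32), round(count_8))
```

With these inputs the missed-deadline sum drops by roughly 39% while CPU
power rises by roughly 13%, and both columns give a Sum/Avg ratio of about
32.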
> [...]
>
> > On Thu, Sep 29, 2022 at 11:59 PM Kajetan Puchalski
> > <kajetan.puchalski@....com> wrote:
> >>
> >> On Thu, Sep 29, 2022 at 01:21:45PM +0200, Peter Zijlstra wrote:
> >>> On Thu, Sep 29, 2022 at 12:10:17PM +0100, Kajetan Puchalski wrote:
> >>>
> >>>> Overall, the problem being solved here is that based on our testing the
> >>>> PELT half life can occasionally be too slow to keep up in scenarios
> >>>> where many frames need to be rendered quickly, especially on high-refresh
> >>>> rate phones and similar devices.
> >>>
> >>> But is it a problem of DVFS not ramping up quick enough; or of the
> >>> load-balancer not reacting to the increase in load, or what aspect
> >>> controlled by PELT is responsible for the improvement seen?
> >>
> >> Based on all the tests we've seen, jankbench or otherwise, the
> >> improvement can mainly be attributed to the faster ramp up of frequency
> >> caused by the shorter PELT window while using schedutil. Alongside that
> >> the signals rising faster also mean that the task would get migrated
> >> faster to bigger CPUs on big.LITTLE systems which improves things too
> >> but it's mostly the frequency aspect of it.
> >>
> >> To establish that this benchmark is sensitive to frequency I ran some
> >> tests using the 'performance' cpufreq governor.
> >>
> >> Max frame duration (ms)
> >>
> >> +------------------+-------------+----------+
> >> | kernel | iteration | value |
> >> |------------------+-------------+----------|
> >> | pelt_1 | 10 | 157.426 |
> >> | pelt_4 | 10 | 85.2713 |
> >> | performance | 10 | 40.9308 |
> >> +------------------+-------------+----------+
> >>
> >> Mean frame duration (ms)
> >>
> >> +---------------+------------------+---------+-------------+
> >> | variable | kernel | value | perc_diff |
> >> |---------------+------------------+---------+-------------|
> >> | mean_duration | pelt_1 | 14.6 | 0.0% |
> >> | mean_duration | pelt_4 | 14.5 | -0.58% |
> >> | mean_duration | performance | 4.4 | -69.75% |
> >> +---------------+------------------+---------+-------------+
> >>
> >> Jank percentage
> >>
> >> +------------+------------------+---------+-------------+
> >> | variable | kernel | value | perc_diff |
> >> |------------+------------------+---------+-------------|
> >> | jank_perc | pelt_1 | 2.1 | 0.0% |
> >> | jank_perc | pelt_4 | 2 | -3.46% |
> >> | jank_perc | performance | 0.1 | -97.25% |
> >> +------------+------------------+---------+-------------+
> >>
> >> As you can see, bumping up frequency can hugely improve the results
> >> here. This is what's happening when we decrease the PELT window, just on
> >> a much smaller and not as drastic scale. It also explains specifically
> >> where the increased power usage is coming from.
>
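As a rough illustration of why a shorter half-life ramps frequency faster,
the sketch below models the utilization signal of a task that starts from
idle and then runs continuously as a simple geometric ramp. This is a
continuous approximation under stated assumptions, not the kernel code: it
ignores PELT's fixed-size accounting periods and integer math, and it says
nothing about how schedutil maps utilization to frequency. The function
names are hypothetical.

```python
import math


def pelt_like_util(t_ms, halflife_ms):
    """Fraction of full utilization after t_ms of continuous running.

    Assumes the signal starts at zero and that the remaining gap to full
    utilization halves every halflife_ms (continuous approximation).
    """
    return 1.0 - 0.5 ** (t_ms / halflife_ms)


def time_to_reach(frac, halflife_ms):
    """Milliseconds of continuous running needed to reach `frac` (0 <= frac < 1)."""
    return -halflife_ms * math.log2(1.0 - frac)


for halflife in (32, 8):
    util_16ms = pelt_like_util(16.0, halflife)
    t_80 = time_to_reach(0.8, halflife)
    print(f"halflife={halflife:2d}ms  util after 16ms={util_16ms:.3f}  "
          f"time to 80%={t_80:.1f}ms")
```

In this model the time to reach any given utilization level scales
linearly with the half-life, so going from 32ms to 8ms cuts it by a factor
of four.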