Message-ID: <CAJZ5v0grLBbRNJHq=_OvC3HqE3BEy=BOwgde_gPk2qyUOWKuZQ@mail.gmail.com>
Date: Wed, 2 Aug 2023 14:20:27 +0200
From: "Rafael J. Wysocki" <rafael@...nel.org>
To: Kajetan Puchalski <kajetan.puchalski@....com>
Cc: "Rafael J. Wysocki" <rjw@...ysocki.net>,
Linux PM <linux-pm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Anna-Maria Behnsen <anna-maria@...utronix.de>,
Frederic Weisbecker <frederic@...nel.org>
Subject: Re: [PATCH v3 0/3] cpuidle: teo: Avoid stopping scheduler tick too often
On Tue, Aug 1, 2023 at 11:53 PM Kajetan Puchalski
<kajetan.puchalski@....com> wrote:
>
> Hi Rafael,
>
> > Hi Folks,
> >
> > Patch [1/3] in this series is a v3 of this patch posted last week:
> >
> > https://lore.kernel.org/linux-pm/4506480.LvFx2qVVIh@kreacher/
> >
> > Patch [2/3] (this is the second version of it) addresses some bail out paths
> > in teo_select() in which the scheduler tick may be stopped unnecessarily too.
> >
> > Patch [3/3] replaces a structure field with a local variable (while at it)
> > and it is the same as its previous version.
> >
> > According to this message:
> >
> > https://lore.kernel.org/linux-pm/CAJZ5v0jJxHj65r2HXBTd3wfbZtsg=_StzwO1kA5STDnaPe_dWA@mail.gmail.com/
> >
> > this series significantly reduces the number of cases in which the governor
> > requests stopping the tick when the selected idle state is shallow, which is
> > incorrect.
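For context, the decision being tuned here is roughly of the following
shape.  This is only an illustrative sketch, not the actual teo code;
the constant and the function below are made up for the example:

    #include <stdbool.h>

    /* Example tick length (250 Hz); the kernel's real value is TICK_NSEC. */
    #define EXAMPLE_TICK_NSEC 4000000ULL

    /*
     * Sketch of the idea: only ask for the tick to be stopped when both
     * the predicted idle duration and the target residency of the
     * selected state are at least a tick long.  A shallow state or a
     * short predicted sleep means stopping the tick buys nothing and
     * only adds latency when periodic ticking has to be restarted.
     */
    static bool should_stop_tick(unsigned long long predicted_ns,
                                 unsigned long long target_residency_ns)
    {
            return predicted_ns >= EXAMPLE_TICK_NSEC &&
                   target_residency_ns >= EXAMPLE_TICK_NSEC;
    }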
> >
> > Thanks!
> >
> >
>
> I did some initial testing with this on Android (Pixel 6, Android 13).
>
> 1. Geekbench 6
>
> +---------------------------+---------------+-----------------+
> | metric | teo | teo_tick |
> +---------------------------+---------------+-----------------+
> | multicore_score | 3320.9 (0.0%) | 3303.3 (-0.53%) |
> | score | 1415.7 (0.0%) | 1417.7 (0.14%) |
> | CPU_total_power | 2421.3 (0.0%) | 2429.3 (0.33%) |
> | latency (AsyncTask #1) | 49.41μ (0.0%) | 51.07μ (3.36%) |
> | latency (labs.geekbench6) | 65.63μ (0.0%) | 77.47μ (18.03%) |
> | latency (surfaceflinger) | 39.46μ (0.0%) | 36.94μ (-6.39%) |
> +---------------------------+---------------+-----------------+
>
> So the big picture for this workload looks roughly the same; the
> differences are too small for me to say with confidence that the
> score/power delta comes from the patches rather than from something
> random in the system.
> The same goes for the latency: the difference for labs.gb6 stands out,
> but that's a pretty irrelevant task that only sets up the benchmark
> rather than the benchmark itself, so not the biggest deal I think.
>
> +---------------+---------+------------+--------+
> | kernel | cluster | idle_state | time |
> +---------------+---------+------------+--------+
> | teo | little | 0.0 | 146.75 |
> | teo | little | 1.0 | 53.75 |
> | teo_tick | little | 0.0 | 63.5 |
> | teo_tick | little | 1.0 | 146.78 |
> +---------------+---------+------------+--------+
>
> +---------------+-------------+------------+
> | kernel | type | count_perc |
> +---------------+-------------+------------+
> | teo | too deep | 2.034 |
> | teo | too shallow | 15.791 |
> | teo_tick | too deep | 2.16 |
> | teo_tick | too shallow | 20.881 |
> +---------------+-------------+------------+
>
> The difference shows up in the idle numbers themselves: it looks like
> we get a big shift towards deeper idle on our efficiency cores (little
> cluster) and more missed wakeups overall, both too deep & too shallow.
>
> Notably, the percentage of too shallow sleeps on the performance cores
> has more or less doubled (2% + 0.8% -> 4.3% + 1.8%). That isn't
> necessarily a problem, but I'll do more testing just in case.
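For reference, "too deep" in these tables means the CPU woke up before
the selected state's target residency had elapsed, and "too shallow"
means it slept long enough that the next deeper state would have paid
off.  A rough sketch of that classification (illustrative only, not the
exact tooling behind the numbers above):

    /* Buckets matching the "too deep" / "too shallow" rows. */
    enum idle_miss { IDLE_OK, IDLE_TOO_DEEP, IDLE_TOO_SHALLOW };

    /*
     * Classify one exit from idle, given how long the CPU actually
     * slept and the target residencies of the selected state and of
     * the next deeper state (0 if there is none).
     */
    static enum idle_miss classify_sleep(unsigned long long slept_ns,
                                         unsigned long long selected_residency_ns,
                                         unsigned long long deeper_residency_ns)
    {
            if (slept_ns < selected_residency_ns)
                    return IDLE_TOO_DEEP;    /* woke before the state paid off */
            if (deeper_residency_ns && slept_ns >= deeper_residency_ns)
                    return IDLE_TOO_SHALLOW; /* a deeper state would have fit */
            return IDLE_OK;
    }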
>
> 2. JetNews (Light UI workload)
>
> +------------------+---------------+----------------+
> | metric | teo | teo_tick |
> +------------------+---------------+----------------+
> | fps | 86.2 (0.0%) | 86.4 (0.16%) |
> | janks_pc | 0.8 (0.0%) | 0.8 (-0.00%) |
> | CPU_total_power | 185.2 (0.0%) | 178.2 (-3.76%) |
> +------------------+---------------+----------------+
>
> On the UI side, the frame data comes out the same on both variants,
> but with better power usage, which is nice to have.
>
> +---------------+---------+------------+-------+
> | kernel | cluster | idle_state | time |
> +---------------+---------+------------+-------+
> | teo | little | 0.0 | 25.06 |
> | teo | little | 1.0 | 12.21 |
> | teo | mid | 0.0 | 38.32 |
> | teo | mid | 1.0 | 17.82 |
> | teo | big | 0.0 | 30.45 |
> | teo | big | 1.0 | 38.5 |
> | teo_tick | little | 0.0 | 23.18 |
> | teo_tick | little | 1.0 | 14.21 |
> | teo_tick | mid | 0.0 | 36.31 |
> | teo_tick | mid | 1.0 | 19.88 |
> | teo_tick | big | 0.0 | 27.13 |
> | teo_tick | big | 1.0 | 42.09 |
> +---------------+---------+------------+-------+
>
> +---------------+-------------+------------+
> | kernel | type | count_perc |
> +---------------+-------------+------------+
> | teo | too deep | 0.992 |
> | teo | too shallow | 17.085 |
> | teo_tick | too deep | 0.945 |
> | teo_tick | too shallow | 15.236 |
> +---------------+-------------+------------+
>
> On the idle side, all 3 clusters shift a bit towards deeper idle here,
> but the overall miss rate is lower across the board, which is
> perfectly fine.
>
> TLDR:
> Mostly no change for a busy workload, and no change plus better power
> for a UI one. The patches make sense to me & the results look all
> right, so no big problems at this stage. I'll do more testing
> (including the RFC you sent out a moment ago) over the next few days
> and send those results out as well.
>
> Unless I bump into any other problems along the way, feel free to grab
> this if you'd like:
> Reviewed-and-tested-by: Kajetan Puchalski <kajetan.puchalski@....com>
Thank you!