lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 23 Nov 2022 15:53:27 -0800
From:   "Doug Smythies" <dsmythies@...us.net>
To:     "'Rafael J. Wysocki'" <rafael@...nel.org>,
        "'Zhang Rui'" <rui.zhang@...el.com>
Cc:     <rjw@...ysocki.net>, <daniel.lezcano@...aro.org>,
        <linux-pm@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        "Doug Smythies" <dsmythies@...us.net>
Subject: RE: [RFC PATCH 2/3] cpuidle: ladder: Tune promotion/demotion threshold

On 2022.11.23 09:50 Rafael wrote:
> On Sat, Nov 5, 2022 at 6:40 PM Zhang Rui <rui.zhang@...el.com> wrote:
>>
>> After fixing the bogus comparison between u64 and s64, the ladder
>> governor stops making promotion decisions errornously.
>>
>> However, after this, it is found that the ladder governor demotes much
>> easier than promotes.
>
> "After fixing an error related to using signed and unsigned integers
> in the ladder governor in a previous patch, that governor turns out to
> demote much easier than promote"
>
>> Below is captured using turbostat after a 30 seconds runtime idle,
>>
>> Without previous patch,
>> Busy%   IRQ     POLL    C1      C1E     C3      C6      C7s     C8      C9      C10     CPU%c1  CPU%c3  CPU%c6  CPU%c7  PkgWatt
>> 0.30    2373    0       0       0       4       9       25      122     326     2857    0.36    0.04    0.57    98.73   1.48
>
> Why is the above relevant?
>
>> With previous patch,
>> Busy%   IRQ     POLL    C1      C1E     C3      C6      C7s     C8      C9      C10     CPU%c1  CPU%c3  CPU%c6  CPU%c7  PkgWatt
>> 0.42    3071    0       771     838     447     327     336     382     299     344     34.18   16.21   17.69   31.51   2.00
>>
>> And this is caused by the imbalanced PROMOTION_COUNT/DEMOTION_COUNT.
>
> I would explain why/how the imbalanced PROMOTION_COUNT/DEMOTION_COUNT
> imbalance causes this.
>
> I guess more residency in the deeper idle state is expected?  Or desired??
>
>> With this patch,
>> Busy%   IRQ     POLL    C1      C1E     C3      C6      C7s     C8      C9      C10     CPU%c1  CPU%c3  CPU%c6  CPU%c7  PkgWatt
>> 0.39    2436    0       1       72      177     51      194     243     799     1883    0.50    0.32    0.35    98.45   1.53
>>
>> Note that this is an experimental patch to illustrate the problem,
>> and it is checked with idle scenario only for now.
>> I will try to evaluate with more scenarios, and if someone can help
>> evaluate with more scenarios at the same time and provide data for the
>> benefit with different PROMOTION_COUNT/DEMOTION_COUNT values, that
>> would be great.
>
> So yes, this requires more work.
>
> Overall, I think that you are concerned that the previous change might
> be regarded as a regression and are trying to compensate for it with a
> PROMOTION_COUNT/DEMOTION_COUNT change.
>
> I'm not sure I can agree with that approach, because the shallower
> idle states might be preferred by the original ladder design
> intentionally, for performance reasons.

Hi All,

Because I was continuing to test the teo governor with
the util patch version 4, it was fairly easy for me to test
this patch set also. However, I have had difficulties having
enough time to write up my results.

The best improvement was for a slow speed ping-pong
(I did 3 speeds, fast, medium, and slow)
2 pairs of ping pongs, not forced CPU affinity,
schedutil CPU scaling governor,
intel_cpufreq CPU scaling driver,
HWP disabled.

The menu governor was considered the master reference:

Old ladder was 44% slower.
New ladder was 5.9% slower.

Just for reference:
Old teo was 29% slower.
teo util V4 was 13% slower.  

The worst degradation was for a fast speed ping-pong
2 pairs of ping pongs, not forced CPU affinity,
schedutil CPU scaling governor,
intel_cpufreq CPU scaling driver,
HWP disabled.

The menu governor was considered the master reference:

Old ladder was 64% slower.
New ladder was 71% slower.

Interestingly, the old ladder governor outperformed
the menu governor on the slow 2 pair ping-pong
with the performance governor:

Old ladder was 0.56% faster.
New ladder was 0.81% slower.

Disclaimer: Test results using the schedutil
CPU scaling governor are noisy, with
questionable repeatability.

I'll try to get all the test results written up soon.

... Doug


Powered by blists - more mailing lists