linux-kernel - Re: [PATCH v4 0/7] sched/fair: improve scan efficiency of SIS


Message-ID: <e834147d-ff5a-9212-5458-3ba91475c21d@bytedance.com>
Date:   Mon, 15 Aug 2022 21:59:40 +0800
From:   Abel Wu <wuyun.abel@...edance.com>
To:     K Prateek Nayak <kprateek.nayak@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Mel Gorman <mgorman@...e.de>,
        Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Josh Don <joshdon@...gle.com>, Chen Yu <yu.c.chen@...el.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        "Gautham R . Shenoy" <gautham.shenoy@....com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/7] sched/fair: improve scan efficiency of SIS

Hi K Prateek, thanks for testing, and sorry for the late reply.

On 7/18/22 7:00 PM, K Prateek Nayak wrote:
> Hello Abel,
> 
> We've tested the patch on a dual socket Zen3 System (2 x 64C/128T).
> 
> tl;dr
> 
> - There is a noticeable regression for Hackbench with the system
>    configured in NPS4 mode. This regression is more noticeable
>    with SIS_UTIL enabled and not as severe with SIS_PROP.
>    This regression is surprising given the patch should have
>    improved SIS Efficiency in case of fully loaded system and is
>    consistently reproducible across multiple runs and reboots.

The regression is unexpected; I will try to reproduce it on my
Intel server. While staring at the code, I found two things that
may be related to the issue:

  - The cpumask_and() in select_idle_cpu() runs before the SIS_UTIL
    check, which can bail out early. So with the SIS filter enabled,
    a lot of effort is wasted whenever nr_idle_scan == 0 (e.g. with
    16 groups). The SIS_PROP case is different: the work done by the
    filter is not all in vain there, which is probably why the
    regression is more noticeable under SIS_UTIL. I am working on a
    patch to optimize this.

  - If nr_idle_scan == 0, select_idle_cpu() bails out early, so it
    is pointless to update the SIS filter; doing so only adds
    overhead on top of the issue above. This will be fixed in the
    next version.
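
To illustrate the ordering problem, here is a toy userspace model. It
is only a sketch, not kernel code and not the planned patch: a 64-bit
word stands in for a cpumask, and toy_cpumask_and(), toy_scan(),
select_mask_first(), select_bail_first() and mask_ops are all
hypothetical names. Modelling the filter as one extra mask AND is an
assumption as well. The point is just that checking nr_idle_scan
first skips the mask work entirely when the scan depth is zero.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, not kernel code: one 64-bit word stands in for a cpumask. */
typedef uint64_t toy_cpumask_t;

/* Counts simulated cpumask_and() calls so the two orderings can be compared. */
unsigned long mask_ops;

static toy_cpumask_t toy_cpumask_and(toy_cpumask_t a, toy_cpumask_t b)
{
	mask_ops++;
	return a & b;
}

/* Visit at most nr candidate CPUs in cpus; return the first idle one, or -1. */
static int toy_scan(toy_cpumask_t cpus, toy_cpumask_t idle, int nr)
{
	for (int cpu = 0; cpu < 64 && nr > 0; cpu++) {
		if (!(cpus & (UINT64_C(1) << cpu)))
			continue;
		nr--;
		if (idle & (UINT64_C(1) << cpu))
			return cpu;
	}
	return -1;
}

/* Ordering described above: masks are built before nr_idle_scan is checked. */
int select_mask_first(toy_cpumask_t span, toy_cpumask_t allowed,
		      toy_cpumask_t filter, toy_cpumask_t idle,
		      int nr_idle_scan)
{
	assert(nr_idle_scan >= 0);

	toy_cpumask_t cpus = toy_cpumask_and(span, allowed);
	cpus = toy_cpumask_and(cpus, filter);	/* assumed filter step */

	if (nr_idle_scan == 0)
		return -1;	/* the mask work above was wasted */

	return toy_scan(cpus, idle, nr_idle_scan);
}

/* Alternative ordering: bail out before touching any mask. */
int select_bail_first(toy_cpumask_t span, toy_cpumask_t allowed,
		      toy_cpumask_t filter, toy_cpumask_t idle,
		      int nr_idle_scan)
{
	assert(nr_idle_scan >= 0);

	if (nr_idle_scan == 0)
		return -1;	/* no mask operations performed */

	toy_cpumask_t cpus = toy_cpumask_and(span, allowed);
	cpus = toy_cpumask_and(cpus, filter);	/* assumed filter step */

	return toy_scan(cpus, idle, nr_idle_scan);
}
```

Both functions pick the same CPU whenever nr_idle_scan > 0; they only
differ in how many mask operations are spent when it is 0.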

I will rework the whole patchset to fit the new SIS_UTIL feature.

> 
> - Apart from the above anomaly, the results look positive overall
>    with the patched kernel behaving as well as, or better than the tip.

Cheers!

> 
> [..snip..]
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Hackbench - 15 runs statistics
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> o NPS 4 - 16 groups (SIS_UTIL)
> 
> - tip
> 
> Min           : 7.35
> Max           : 12.66
> Median        : 10.60
> AMean         : 10.00
> GMean         : 9.82
> HMean         : 9.64
> AMean Stddev  : 1.88
> AMean CoefVar : 18.85 pct
> 
> - SIS_Eff
> 
> Min           : 12.32
> Max           : 18.92
> Median        : 13.82
> AMean         : 14.96	(-49.60 pct)
> GMean         : 14.80
> HMean         : 14.66
> AMean Stddev  : 2.25
> AMean CoefVar : 15.01 pct
> 
> o NPS 4 - 16 groups (SIS_PROP)
> 
> - tip
> 
> Min           : 7.04
> Max           : 8.22
> Median        : 7.49
> AMean         : 7.52
> GMean         : 7.52
> HMean         : 7.51
> AMean Stddev  : 0.29
> AMean CoefVar : 3.88 pct
> 
> - SIS_Eff
> 
> Min           : 7.04
> Max           : 9.78
> Median        : 8.16
> AMean         : 8.42	(-11.06 pct)
> GMean         : 8.39
> HMean         : 8.36
> AMean Stddev  : 0.78
> AMean CoefVar : 9.23 pct
> 
> The Hackbench regression is much more noticeable with SIS_UTIL
> enabled but only when the test machine is running in NPS4 mode.
> It is not obvious why this is happening given the patch series
> aims at improving SIS Efficiency.

The result seems to be connected to the LLC size in some way.
I need some time to figure it out.

> 
> It would be great if you can test the series with SIS_UTIL
> enabled and SIS_PROP disabled to see if it affects any benchmark
> behavior given SIS_UTIL is the default SIS logic currently on
> the tip.

Yes, I will.

Thanks & Best Regards,
Abel