linux-kernel - Re: [PATCH v6 0/4] sched/fair: Improve scan efficiency of SIS

Message-ID: <02890ec6-b7dc-e0cd-4797-d5343d42361c@bytedance.com>
Date:   Thu, 24 Nov 2022 11:50:13 +0800
From:   Abel Wu <wuyun.abel@...edance.com>
To:     K Prateek Nayak <kprateek.nayak@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>, Mel Gorman <mgorman@...e.de>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Valentin Schneider <valentin.schneider@....com>
Cc:     Josh Don <joshdon@...gle.com>, Chen Yu <yu.c.chen@...el.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        "Gautham R . Shenoy" <gautham.shenoy@....com>,
        Aubrey Li <aubrey.li@...el.com>,
        Qais Yousef <qais.yousef@....com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Rik van Riel <riel@...riel.com>,
        Yicong Yang <yangyicong@...wei.com>,
        Barry Song <21cnbao@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 0/4] sched/fair: Improve scan efficiency of SIS

Hi Prateek, thanks again for your detailed test!

On 11/22/22 7:28 PM, K Prateek Nayak wrote:
> Hello Abel,
> 
> Following are the results for hackbench with a larger number of
> groups, ycsb-mongodb, Spec-JBB, and unixbench. Apart from
> a regression in unixbench spawn in NPS2 and NPS4 mode and
> unixbench syscall in NPS2 mode, everything looks good.
> 
> ...
> 
> -> unixbench-syscall
> 
> o NPS4
> 
> kernel:                             tip                  sis_core
> Min       unixbench-syscall-1    2971799.80 (   0.00%)  2979335.60 (  -0.25%)
> Min       unixbench-syscall-512  7824196.90 (   0.00%)  8155610.20 (  -4.24%)
> Amean     unixbench-syscall-1    2973045.43 (   0.00%)  2982036.13 *  -0.30%*
> Amean     unixbench-syscall-512  7826302.17 (   0.00%)  8173026.57 *  -4.43%*   <-- Regression in syscall for larger worker count
> CoeffVar  unixbench-syscall-1          0.04 (   0.00%)        0.09 (-139.63%)
> CoeffVar  unixbench-syscall-512        0.03 (   0.00%)        0.20 (-701.13%)
> 
> 
> -> unixbench-spawn
> 
> o NPS1
> 
> kernel:                             tip                  sis_core
> Min       unixbench-spawn-1       6536.50 (   0.00%)     6000.30 (  -8.20%)
> Min       unixbench-spawn-512    72571.40 (   0.00%)    70829.60 (  -2.40%)
> Hmean     unixbench-spawn-1       6811.16 (   0.00%)     7016.11 (   3.01%)
> Hmean     unixbench-spawn-512    72801.77 (   0.00%)    71012.03 *  -2.46%*
> CoeffVar  unixbench-spawn-1          3.69 (   0.00%)       13.52 (-266.69%)
> CoeffVar  unixbench-spawn-512        0.27 (   0.00%)        0.22 (  18.25%)
> 
> o NPS2
> 
> kernel:                             tip                  sis_core
> Min       unixbench-spawn-1       7042.20 (   0.00%)     7078.70 (   0.52%)
> Min       unixbench-spawn-512    85571.60 (   0.00%)    77362.60 (  -9.59%)
> Hmean     unixbench-spawn-1       7199.01 (   0.00%)     7276.55 (   1.08%)
> Hmean     unixbench-spawn-512    85717.77 (   0.00%)    77923.73 *  -9.09%*     <-- Regression in spawn test for larger worker count
> CoeffVar  unixbench-spawn-1          3.50 (   0.00%)        3.30 (   5.70%)
> CoeffVar  unixbench-spawn-512        0.20 (   0.00%)        0.82 (-304.88%)
> 
> o NPS4
> 
> kernel:                             tip                  sis_core
> Min       unixbench-spawn-1       7521.90 (   0.00%)     8102.80 (   7.72%)
> Min       unixbench-spawn-512    84245.70 (   0.00%)    73074.50 ( -13.26%)
> Hmean     unixbench-spawn-1       7659.12 (   0.00%)     8645.19 *  12.87%*
> Hmean     unixbench-spawn-512    84908.77 (   0.00%)    73409.49 * -13.54%*     <-- Regression in spawn test for larger worker count
> CoeffVar  unixbench-spawn-1          1.92 (   0.00%)        5.78 (-200.56%)
> CoeffVar  unixbench-spawn-512        0.76 (   0.00%)        0.41 (  46.58%)
> 
> ...
> 
> For the unixbench regressions, I do not see anything obvious jumping
> out in the perf traces captured with IBS. top shows over 99% utilization,
> which would ideally mean there are not many updates to the mask.
> I'll take a closer look at the spawn test case and get back to you.

These regressions seem to be common in fully parallel tests. I
guess they might be due to over-updating the idle cpumask when the LLC
is overloaded, which is not necessary if SIS_UTIL is enabled, but I
need to dig into it further. Maybe the rq avg_idle or nr_idle_scan needs
to be taken into consideration as well. Thanks for providing this
important information.
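
As a rough illustration of the hypothesis above, here is a toy user-space
model, not kernel code. The class ToyLLC, the overloaded flag and the
skip_when_overloaded switch are all hypothetical names made up for this
sketch; it only counts how many writes to a shared idle mask could be
avoided if updates were gated on an overload condition.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLLC:
    """Toy stand-in for one LLC domain (hypothetical, not a kernel struct)."""
    nr_cpus: int
    idle_mask: set = field(default_factory=set)
    mask_updates: int = 0  # number of writes to the shared mask

    def set_idle(self, cpu, idle, overloaded, skip_when_overloaded):
        # Hypothetical gate: when the LLC is overloaded, leave the mask alone.
        if overloaded and skip_when_overloaded:
            return
        if idle:
            self.idle_mask.add(cpu)
        else:
            self.idle_mask.discard(cpu)
        self.mask_updates += 1


def count_updates(skip_when_overloaded, transitions=1000):
    """Replay idle/busy transitions on a permanently overloaded toy LLC."""
    llc = ToyLLC(nr_cpus=8)
    for i in range(transitions):
        llc.set_idle(cpu=i % llc.nr_cpus, idle=(i % 2 == 0),
                     overloaded=True,
                     skip_when_overloaded=skip_when_overloaded)
    return llc.mask_updates


print(count_updates(skip_when_overloaded=False))  # every transition writes
print(count_updates(skip_when_overloaded=True))   # all writes are skipped
```

Skipping updates leaves the mask stale, so any real change along these
lines would also need a way to resynchronize the mask once the LLC is no
longer overloaded; the toy model does not attempt that.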

> 
> ~~~~~~~~~~~~~
> ~ Hackbench ~
> ~~~~~~~~~~~~~
> 
> $ perf bench sched messaging -p -l 50000 -g <groups>
> 
> o NPS1
> 
> kernel:               tip                     sis_core
> 32-groups:         6.20 (0.00 pct)         5.86 (5.48 pct)
> 64-groups:        16.55 (0.00 pct)        15.21 (8.09 pct)
> 128-groups:       42.57 (0.00 pct)        34.63 (18.65 pct)
> 256-groups:       71.69 (0.00 pct)        67.11 (6.38 pct)
> 512-groups:      108.48 (0.00 pct)       110.23 (-1.61 pct)
> 
> o NPS2
> 
> kernel:                tip                     sis_core
> 32-groups:         6.56 (0.00 pct)         5.60 (14.63 pct)
> 64-groups:        15.74 (0.00 pct)        14.45 (8.19 pct)
> 128-groups:       39.93 (0.00 pct)        35.33 (11.52 pct)
> 256-groups:       74.49 (0.00 pct)        69.65 (6.49 pct)
> 512-groups:      112.22 (0.00 pct)       113.75 (-1.36 pct)
> 
> o NPS4
> 
> kernel:               tip                     sis_core
> 32-groups:         9.48 (0.00 pct)         5.64 (40.50 pct)
> 64-groups:        15.38 (0.00 pct)        14.13 (8.12 pct)
> 128-groups:       39.93 (0.00 pct)        34.47 (13.67 pct)
> 256-groups:       75.31 (0.00 pct)        67.98 (9.73 pct)
> 512-groups:      115.37 (0.00 pct)       111.15 (3.65 pct)
> 
> Note: Hackbench with 32-groups shows run-to-run variation
> on tip but is more stable with sis_core. Hackbench for
> 64-groups and beyond is stable on both kernels.
> 
The results are consistent with mine, except for 512-groups, which I
didn't test. The 512-groups test may have the same problem
mentioned above.
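
For anyone re-deriving the percentage columns quoted in this thread, they
appear to be relative changes against tip, with the sign flipped so that a
positive number is always an improvement. A minimal sketch, assuming
hackbench reports time (lower is better) and unixbench spawn reports
throughput (higher is better); the helper name pct_change is made up here:

```python
def pct_change(tip, sis_core, higher_is_better):
    """Relative change of sis_core vs. tip in percent; positive = improvement."""
    delta = (sis_core - tip) if higher_is_better else (tip - sis_core)
    return round(100.0 * delta / tip, 2)


# hackbench, NPS1 (time, lower is better)
print(pct_change(6.20, 5.86, higher_is_better=False))      # 5.48
print(pct_change(108.48, 110.23, higher_is_better=False))  # -1.61

# unixbench-spawn-512 Hmean, NPS2 (throughput, higher is better)
print(pct_change(85717.77, 77923.73, higher_is_better=True))  # -9.09
```

Some rows may differ in the last digit because the tables print rounded
inputs.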

Thanks & Regards,
	Abel