Message-ID: <02890ec6-b7dc-e0cd-4797-d5343d42361c@bytedance.com>
Date: Thu, 24 Nov 2022 11:50:13 +0800
From: Abel Wu <wuyun.abel@...edance.com>
To: K Prateek Nayak <kprateek.nayak@....com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>, Mel Gorman <mgorman@...e.de>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Valentin Schneider <valentin.schneider@....com>
Cc: Josh Don <joshdon@...gle.com>, Chen Yu <yu.c.chen@...el.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
"Gautham R . Shenoy" <gautham.shenoy@....com>,
Aubrey Li <aubrey.li@...el.com>,
Qais Yousef <qais.yousef@....com>,
Juri Lelli <juri.lelli@...hat.com>,
Rik van Riel <riel@...riel.com>,
Yicong Yang <yangyicong@...wei.com>,
Barry Song <21cnbao@...il.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 0/4] sched/fair: Improve scan efficiency of SIS
Hi Prateek, thanks again for your detailed test!
On 11/22/22 7:28 PM, K Prateek Nayak wrote:
> Hello Abel,
>
> Following are the results for hackbench with a larger number of
> groups, ycsb-mongodb, Spec-JBB, and unixbench. Apart from
> a regression in unixbench spawn in NPS2 and NPS4 mode and
> unixbench syscall in NPS2 mode, everything looks good.
>
> ...
>
> -> unixbench-syscall
>
> o NPS4
>
> kernel: tip sis_core
> Min unixbench-syscall-1 2971799.80 ( 0.00%) 2979335.60 ( -0.25%)
> Min unixbench-syscall-512 7824196.90 ( 0.00%) 8155610.20 ( -4.24%)
> Amean unixbench-syscall-1 2973045.43 ( 0.00%) 2982036.13 * -0.30%*
> Amean unixbench-syscall-512 7826302.17 ( 0.00%) 8173026.57 * -4.43%* <-- Regression in syscall for larger worker count
> CoeffVar unixbench-syscall-1 0.04 ( 0.00%) 0.09 (-139.63%)
> CoeffVar unixbench-syscall-512 0.03 ( 0.00%) 0.20 (-701.13%)
>
>
> -> unixbench-spawn
>
> o NPS1
>
> kernel: tip sis_core
> Min unixbench-spawn-1 6536.50 ( 0.00%) 6000.30 ( -8.20%)
> Min unixbench-spawn-512 72571.40 ( 0.00%) 70829.60 ( -2.40%)
> Hmean unixbench-spawn-1 6811.16 ( 0.00%) 7016.11 ( 3.01%)
> Hmean unixbench-spawn-512 72801.77 ( 0.00%) 71012.03 * -2.46%*
> CoeffVar unixbench-spawn-1 3.69 ( 0.00%) 13.52 (-266.69%)
> CoeffVar unixbench-spawn-512 0.27 ( 0.00%) 0.22 ( 18.25%)
>
> o NPS2
>
> kernel: tip sis_core
> Min unixbench-spawn-1 7042.20 ( 0.00%) 7078.70 ( 0.52%)
> Min unixbench-spawn-512 85571.60 ( 0.00%) 77362.60 ( -9.59%)
> Hmean unixbench-spawn-1 7199.01 ( 0.00%) 7276.55 ( 1.08%)
> Hmean unixbench-spawn-512 85717.77 ( 0.00%) 77923.73 * -9.09%* <-- Regression in spawn test for larger worker count
> CoeffVar unixbench-spawn-1 3.50 ( 0.00%) 3.30 ( 5.70%)
> CoeffVar unixbench-spawn-512 0.20 ( 0.00%) 0.82 (-304.88%)
>
> o NPS4
>
> kernel: tip sis_core
> Min unixbench-spawn-1 7521.90 ( 0.00%) 8102.80 ( 7.72%)
> Min unixbench-spawn-512 84245.70 ( 0.00%) 73074.50 ( -13.26%)
> Hmean unixbench-spawn-1 7659.12 ( 0.00%) 8645.19 * 12.87%*
> Hmean unixbench-spawn-512 84908.77 ( 0.00%) 73409.49 * -13.54%* <-- Regression in spawn test for larger worker count
> CoeffVar unixbench-spawn-1 1.92 ( 0.00%) 5.78 (-200.56%)
> CoeffVar unixbench-spawn-512 0.76 ( 0.00%) 0.41 ( 46.58%)
>
> ...
>
> For unixbench regressions, I do not see anything obvious jump up
> in perf traces captured with IBS. top shows over 99% utilization
> which would ideally mean there are not many updates to the mask.
> I'll take some more look at the spawn test case and get back to you.
These regressions seem to be common in fully parallel tests. My
guess is that the idle cpumask is updated too often when the LLC
is overloaded, which is unnecessary if SIS_UTIL is enabled, but I
need to dig into it further. Maybe the rq avg_idle or nr_idle_scan
needs to be taken into consideration as well. Thanks for providing
this important information.
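To picture the guess above, here is a toy user-space model (not kernel
code, and not taken from the patch set). The class LLCIdleMask, the
skip_when_overloaded switch, and the "more runnable tasks than CPUs"
overload test are all hypothetical names and assumptions made for
illustration. It only counts how many mask writes a gate would avoid; it
says nothing about the real cost of those writes.

```python
class LLCIdleMask:
    """Toy model of a per-LLC idle-CPU mask (hypothetical, not kernel code)."""

    def __init__(self, nr_cpus, skip_when_overloaded):
        self.nr_cpus = nr_cpus
        self.skip_when_overloaded = skip_when_overloaded
        self.idle = set()      # CPUs currently marked idle
        self.updates = 0       # number of writes to the shared mask

    def cpu_goes_idle(self, cpu, nr_running):
        # Assumed overload test: more runnable tasks than CPUs in the LLC.
        if self.skip_when_overloaded and nr_running > self.nr_cpus:
            return             # a scan would be cut short anyway; skip the write
        if cpu not in self.idle:
            self.idle.add(cpu)
            self.updates += 1

    def cpu_goes_busy(self, cpu):
        # Only clear (and count a write) if the CPU was marked idle earlier.
        if cpu in self.idle:
            self.idle.remove(cpu)
            self.updates += 1


def count_updates(skip_when_overloaded, nr_cpus=8, nr_running=512, rounds=1000):
    """Each CPU goes briefly idle and then busy again, `rounds` times."""
    mask = LLCIdleMask(nr_cpus, skip_when_overloaded)
    for _ in range(rounds):
        for cpu in range(nr_cpus):
            mask.cpu_goes_idle(cpu, nr_running)
            mask.cpu_goes_busy(cpu)
    return mask.updates


if __name__ == "__main__":
    print("ungated:", count_updates(False))
    print("gated:  ", count_updates(True))
```

In this model every brief idle period costs two mask writes when
ungated and none when gated, provided the LLC counts as overloaded.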
>
> ~~~~~~~~~~~~~
> ~ Hackbench ~
> ~~~~~~~~~~~~~
>
> $ perf bench sched messaging -p -l 50000 -g <groups>
>
> o NPS1
>
> kernel: tip sis_core
> 32-groups: 6.20 (0.00 pct) 5.86 (5.48 pct)
> 64-groups: 16.55 (0.00 pct) 15.21 (8.09 pct)
> 128-groups: 42.57 (0.00 pct) 34.63 (18.65 pct)
> 256-groups: 71.69 (0.00 pct) 67.11 (6.38 pct)
> 512-groups: 108.48 (0.00 pct) 110.23 (-1.61 pct)
>
> o NPS2
>
> kernel: tip sis_core
> 32-groups: 6.56 (0.00 pct) 5.60 (14.63 pct)
> 64-groups: 15.74 (0.00 pct) 14.45 (8.19 pct)
> 128-groups: 39.93 (0.00 pct) 35.33 (11.52 pct)
> 256-groups: 74.49 (0.00 pct) 69.65 (6.49 pct)
> 512-groups: 112.22 (0.00 pct) 113.75 (-1.36 pct)
>
> o NPS4:
>
> kernel: tip sis_core
> 32-groups: 9.48 (0.00 pct) 5.64 (40.50 pct)
> 64-groups: 15.38 (0.00 pct) 14.13 (8.12 pct)
> 128-groups: 39.93 (0.00 pct) 34.47 (13.67 pct)
> 256-groups: 75.31 (0.00 pct) 67.98 (9.73 pct)
> 512-groups: 115.37 (0.00 pct) 111.15 (3.65 pct)
>
> Note: Hackbench with 32-groups shows run-to-run variation
> on tip but is more stable with sis_core. Hackbench for
> 64-groups and beyond is stable on both kernels.
>
The results are consistent with mine, except for 512-groups, which
I didn't test. The 512-groups test may have the same problem
mentioned above.
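For reading the hackbench tables above, a minimal sketch of how the
"pct" column appears to be derived, assuming it is the relative change
against tip with lower run time counted as an improvement. The helper
name pct_improvement is hypothetical, and the formula is inferred from
the numbers rather than stated in the thread.

```python
def pct_improvement(baseline, candidate):
    """Percent improvement of `candidate` over `baseline` when lower is better."""
    return round((baseline - candidate) / baseline * 100, 2)


# (tip, sis_core) run times copied from the NPS1 hackbench table.
nps1 = {
    "32-groups": (6.20, 5.86),
    "512-groups": (108.48, 110.23),
}

if __name__ == "__main__":
    for name, (tip, sis_core) in nps1.items():
        print(f"{name}: {pct_improvement(tip, sis_core)} pct")
```

A positive value means sis_core finished faster than tip; the negative
value for 512-groups is the small slowdown discussed above.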
Thanks & Regards,
Abel