Message-ID: <e834147d-ff5a-9212-5458-3ba91475c21d@bytedance.com>
Date: Mon, 15 Aug 2022 21:59:40 +0800
From: Abel Wu <wuyun.abel@...edance.com>
To: K Prateek Nayak <kprateek.nayak@....com>,
Peter Zijlstra <peterz@...radead.org>,
Mel Gorman <mgorman@...e.de>,
Vincent Guittot <vincent.guittot@...aro.org>
Cc: Josh Don <joshdon@...gle.com>, Chen Yu <yu.c.chen@...el.com>,
Tim Chen <tim.c.chen@...ux.intel.com>,
"Gautham R . Shenoy" <gautham.shenoy@....com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/7] sched/fair: improve scan efficiency of SIS
Hi K Prateek, thanks for testing, and sorry for the late reply.
On 7/18/22 7:00 PM, K Prateek Nayak wrote:
> Hello Abel,
>
> We've tested the patch on a dual socket Zen3 System (2 x 64C/128T).
>
> tl;dr
>
> - There is a noticeable regression for Hackbench with the system
> configured in NPS4 mode. This regression is more noticeable
> with SIS_UTIL enabled and not as severe with SIS_PROP.
> This regression is surprising given the patch should have
> improved SIS Efficiency in case of fully loaded system and is
> consistently reproducible across multiple runs and reboots.
The regression is unexpected. I will try to reproduce it on my
Intel server. While staring at the code, I found two things that
may be related to the issue:
- The cpumask_and() in select_idle_cpu() runs before the SIS_UTIL
check, which can bail out early. So with the SIS filter enabled,
a lot of effort is wasted when nr_idle_scan == 0 (e.g. 16 groups).
The SIS_PROP case is different: the work done by the filter is
not entirely in vain, which is probably why the regression is
more noticeable under SIS_UTIL. I am working on a patch to
optimize this.
- If nr_idle_scan == 0 then select_idle_cpu() will bail out early,
so it is pointless to update the SIS filter in that case. The
update only adds overhead on top of the issue above. This will
be fixed in the next version.
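To make the ordering problem concrete, here is a minimal userspace
sketch, not the kernel code. It models the cpumasks as 64-bit words and
counts mask operations. All names in it (struct llc_model, mask_and(),
scan_mask_first(), scan_check_first()) are made up for illustration;
only nr_idle_scan and the "build the mask, then check" ordering come
from the discussion above.

```c
#include <stdint.h>

/* Hypothetical userspace model of one LLC domain; not kernel code. */
struct llc_model {
	uint64_t span;             /* CPUs in the LLC domain */
	uint64_t filter;           /* CPUs the SIS filter thinks may be idle */
	unsigned int nr_idle_scan; /* 0: LLC is treated as overloaded */
};

static unsigned long mask_ops;     /* number of mask operations performed */

static uint64_t mask_and(uint64_t a, uint64_t b)
{
	mask_ops++;
	return a & b;
}

/* Lowest set bit in the mask, or -1 if the mask is empty. */
static int first_cpu(uint64_t mask)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if ((mask >> cpu) & 1)
			return cpu;
	return -1;
}

/* Ordering described above: masks are built before the bail-out check. */
static int scan_mask_first(const struct llc_model *llc, uint64_t allowed)
{
	uint64_t cpus = mask_and(llc->span, allowed);

	cpus = mask_and(cpus, llc->filter);
	if (llc->nr_idle_scan == 0)
		return -1;         /* both mask operations were wasted */
	return first_cpu(cpus);
}

/* Reordered: check nr_idle_scan first, touch the masks only if needed. */
static int scan_check_first(const struct llc_model *llc, uint64_t allowed)
{
	uint64_t cpus;

	if (llc->nr_idle_scan == 0)
		return -1;         /* no mask work done at all */
	cpus = mask_and(llc->span, allowed);
	cpus = mask_and(cpus, llc->filter);
	return first_cpu(cpus);
}
```

With nr_idle_scan == 0, scan_mask_first() still performs two mask
operations before returning -1, while scan_check_first() performs none.
Both return the same CPU when nr_idle_scan is non-zero. The real
masks are much wider than 64 bits on a 2 x 64C/128T machine, so the
wasted work per wakeup would be correspondingly larger.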
I will rework the whole patchset to fit the new SIS_UTIL feature.
>
> - Apart from the above anomaly, the results look positive overall
> with the patched kernel behaving as well as, or better than the tip.
Cheers!
>
> [..snip..]
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Hackbench - 15 runs statistics
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> o NPS 4 - 16 groups (SIS_UTIL)
>
> - tip
>
> Min : 7.35
> Max : 12.66
> Median : 10.60
> AMean : 10.00
> GMean : 9.82
> HMean : 9.64
> AMean Stddev : 1.88
> AMean CoefVar : 18.85 pct
>
> - SIS_Eff
>
> Min : 12.32
> Max : 18.92
> Median : 13.82
> AMean : 14.96 (-49.60 pct)
> GMean : 14.80
> HMean : 14.66
> AMean Stddev : 2.25
> AMean CoefVar : 15.01 pct
>
> o NPS 4 - 16 groups (SIS_PROP)
>
> - tip
>
> Min : 7.04
> Max : 8.22
> Median : 7.49
> AMean : 7.52
> GMean : 7.52
> HMean : 7.51
> AMean Stddev : 0.29
> AMean CoefVar : 3.88 pct
>
> - SIS_Eff
>
> Min : 7.04
> Max : 9.78
> Median : 8.16
> AMean : 8.42 (-11.06 pct)
> GMean : 8.39
> HMean : 8.36
> AMean Stddev : 0.78
> AMean CoefVar : 9.23 pct
>
> The Hackbench regression is much more noticeable with SIS_UTIL
> enabled but only when the test machine is running in NPS4 mode.
> It is not obvious why this is happening given the patch series
> aims at improving SIS Efficiency.
The result seems to be related to the LLC size. I need some time
to figure it out.
>
> It would be great if you can test the series with SIS_UTIL
> enabled and SIS_PROP disabled to see if it affects any benchmark
> behavior given SIS_UTIL is the default SIS logic currently on
> the tip.
Yes, I will.
Thanks & Best Regards,
Abel