linux-kernel - Re: [PATCH v5 2/5] sched/fair: Limited scan for idle cores when overloaded

Date:   Fri, 9 Sep 2022 17:29:58 +0800
From:   Chen Yu <yu.c.chen@...el.com>
To:     Abel Wu <wuyun.abel@...edance.com>
CC:     Peter Zijlstra <peterz@...radead.org>,
        Mel Gorman <mgorman@...e.de>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Josh Don <joshdon@...gle.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        K Prateek Nayak <kprateek.nayak@....com>,
        "Gautham R . Shenoy" <gautham.shenoy@....com>,
        <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...hsingularity.net>
Subject: Re: [PATCH v5 2/5] sched/fair: Limited scan for idle cores when
 overloaded

On 2022-09-09 at 13:53:01 +0800, Abel Wu wrote:
> The has_idle_cores hint could be misleading due to rapidly idling
> workloads, especially when the LLC is overloaded. In that case, a full
> scan is incurred at some cost and often fails to find an idle core.
> 
> So limit the scan depth for idle cores in such case to make a
> speculative inspection at a reasonable cost.
> 
> Benchmark
> =========
> 
> Tests are done on a dual-socket (2 x 24C/48T) Intel Xeon(R) Platinum
> 8260 machine, with SNC configuration:
> 
> 	SNC on:  4 NUMA nodes each of which has 12C/24T
> 	SNC off: 2 NUMA nodes each of which has 24C/48T
> 
> All of the benchmarks are done inside a normal cpu cgroup in a clean
> environment with cpu turbo disabled.
> 
> Based on tip sched/core 0fba527e959d (v5.19.0) plus previous patches
> of this series.
> 
> Results
> =======
> 
> hackbench-process-pipes
>                          unpatched		  patched
> (SNC on)
> Amean     1        0.4470 (   0.00%)      0.4557 (  -1.94%)
> Amean     4        0.5947 (   0.00%)      0.6033 (  -1.46%)
> Amean     7        0.7450 (   0.00%)      0.7627 (  -2.37%)
> Amean     12       1.1053 (   0.00%)      1.0653 (   3.62%)
> Amean     21       1.9420 (   0.00%)      2.0283 *  -4.45%*
> Amean     30       2.9267 (   0.00%)      2.9670 (  -1.38%)
> Amean     48       4.7027 (   0.00%)      4.6863 (   0.35%)
> Amean     79       7.7097 (   0.00%)      7.9443 *  -3.04%*
> Amean     110     10.0680 (   0.00%)     10.2393 (  -1.70%)
> Amean     141     12.5450 (   0.00%)     12.6343 (  -0.71%)
> Amean     172     15.0297 (   0.00%)     14.9957 (   0.23%)
> Amean     203     16.8827 (   0.00%)     16.9133 (  -0.18%)
> Amean     234     19.1183 (   0.00%)     19.2433 (  -0.65%)
> Amean     265     20.9893 (   0.00%)     21.6917 (  -3.35%)
> Amean     296     23.3920 (   0.00%)     23.8743 (  -2.06%)
> (SNC off)
> Amean     1        0.2717 (   0.00%)      0.3143 ( -15.71%)
> Amean     4        0.6257 (   0.00%)      0.6070 (   2.98%)
> Amean     7        0.7740 (   0.00%)      0.7960 (  -2.84%)
> Amean     12       1.2410 (   0.00%)      1.1947 (   3.73%)
> Amean     21       2.6410 (   0.00%)      2.4837 (   5.96%)
> Amean     30       3.7620 (   0.00%)      3.4577 (   8.09%)
> Amean     48       6.7757 (   0.00%)      5.5227 *  18.49%*
> Amean     79       8.8827 (   0.00%)      9.2933 (  -4.62%)
> Amean     110     11.0583 (   0.00%)     11.0443 (   0.13%)
> Amean     141     13.3387 (   0.00%)     13.1360 (   1.52%)
> Amean     172     15.9583 (   0.00%)     15.7770 (   1.14%)
> Amean     203     17.8757 (   0.00%)     17.9557 (  -0.45%)
> Amean     234     20.0543 (   0.00%)     20.4373 *  -1.91%*
> Amean     265     22.6643 (   0.00%)     23.6053 *  -4.15%*
> Amean     296     25.6677 (   0.00%)     25.6803 (  -0.05%)
> 
> Run-to-run variation is large in the 1-group test, so that result can
> be safely ignored.
> 
> With limited scan for idle cores when the LLC is overloaded, a slight
> regression can be seen on the machine with smaller LLCs (SNC on). This
> is because a full scan costs much less on these LLCs than on bigger
> ones. In the SNC off case, the limited scan provides an obvious benefit,
> especially when such scans are relatively frequent, e.g. <48 groups.
> 
> It's not a universal win, but considering that LLCs are getting bigger
> nowadays, we should be careful about the scan depth, and a limited scan
> in certain circumstances is indeed necessary.
> 
> tbench4 Throughput
>                          unpatched		  patched
> (SNC on)
> Hmean     1        309.43 (   0.00%)      301.54 *  -2.55%*
> Hmean     2        613.92 (   0.00%)      607.81 *  -0.99%*
> Hmean     4       1227.84 (   0.00%)     1210.64 *  -1.40%*
> Hmean     8       2379.04 (   0.00%)     2381.73 *   0.11%*
> Hmean     16      4634.66 (   0.00%)     4601.21 *  -0.72%*
> Hmean     32      7592.12 (   0.00%)     7626.84 *   0.46%*
> Hmean     64      9241.11 (   0.00%)     9251.51 *   0.11%*
> Hmean     128    17870.37 (   0.00%)    20620.98 *  15.39%*
> Hmean     256    19370.92 (   0.00%)    20406.51 *   5.35%*
> Hmean     384    19413.92 (   0.00%)    20312.97 *   4.63%*
> (SNC off)
> Hmean     1        287.90 (   0.00%)      292.37 *   1.55%*
> Hmean     2        575.52 (   0.00%)      583.29 *   1.35%*
> Hmean     4       1137.94 (   0.00%)     1155.83 *   1.57%*
> Hmean     8       2250.42 (   0.00%)     2297.63 *   2.10%*
> Hmean     16      4363.41 (   0.00%)     4562.44 *   4.56%*
> Hmean     32      7338.06 (   0.00%)     7425.69 *   1.19%*
> Hmean     64      8914.66 (   0.00%)     9021.77 *   1.20%*
> Hmean     128    19978.59 (   0.00%)    20257.76 *   1.40%*
> Hmean     256    20057.49 (   0.00%)    20043.54 *  -0.07%*
> Hmean     384    19846.74 (   0.00%)    19528.03 *  -1.61%*
> 
> Conclusion
> ==========
> 
> Limited scan for idle cores when LLC is overloaded is almost neutral
> compared to full scan given smaller LLCs, but is an obvious win at
> the bigger ones which are future-oriented.
> 
> Suggested-by: Mel Gorman <mgorman@...hsingularity.net>
> Signed-off-by: Abel Wu <wuyun.abel@...edance.com>
> ---
>  kernel/sched/fair.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5af9bf246274..7abe188a1533 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6437,26 +6437,42 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
>  		time = cpu_clock(this);
>  	}
>  
> -	if (sched_feat(SIS_UTIL) && !has_idle_core) {
> +	if (sched_feat(SIS_UTIL)) {
Patch [1/5] added the !has_idle_core check, but this patch removes it.
I'm trying to figure out the reason. Is it to better illustrate the
benchmark difference?
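
To make the question concrete, here is a minimal userspace sketch (not
kernel code) of the two gate conditions in the hunk above, together with
a bounded scan. The rest of the hunk is not quoted here, so how the patch
actually derives and applies the depth is not shown; the names
limit_applies_v1, limit_applies_v2, scan_idle_core and the max_depth
parameter are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Gate as in [1/5]: the limit branch is entered only without an idle-core hint. */
static bool limit_applies_v1(bool sis_util, bool has_idle_core)
{
	return sis_util && !has_idle_core;
}

/* Gate after this hunk: the branch is entered whenever SIS_UTIL is enabled. */
static bool limit_applies_v2(bool sis_util, bool has_idle_core)
{
	(void)has_idle_core;	/* the hint no longer affects the gate */
	return sis_util;
}

/*
 * Hypothetical bounded scan: inspect at most max_depth entries of
 * core_idle[] and return the index of the first idle core, or -1 if
 * none is found within the budget. max_depth == ncores is a full scan.
 */
static int scan_idle_core(const bool *core_idle, size_t ncores, size_t max_depth)
{
	size_t depth = max_depth < ncores ? max_depth : ncores;

	assert(core_idle != NULL);

	for (size_t i = 0; i < depth; i++) {
		if (core_idle[i])
			return (int)i;
	}
	return -1;
}
```

In this model, with has_idle_core set, the v1 gate skips the branch while
the v2 gate enters it, which is the behavioural difference being asked
about. A budget smaller than the number of cores may miss an idle core
that a full scan would have found.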

thanks,
Chenyu