Open Source and information security mailing list archives
 
Date:   Thu, 24 Feb 2022 11:19:38 +0800
From:   Abel Wu <wuyun.abel@...edance.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Ben Segall <bsegall@...gle.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Mel Gorman <mgorman@...e.de>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Ingo Molnar <mingo@...hat.com>, linux-kernel@...r.kernel.org,
        Abel Wu <wuyun.abel@...edance.com>
Subject: Re: [RFC PATCH 0/5] introduce sched-idle balancing

Ping :)

On 2/17/22 11:43 PM, Abel Wu wrote:
> Current load balancing is mainly based on cpu capacity
> and task util, which makes sense from the POV of overall
> throughput. Still, there is room for improvement: the
> number of overloaded cfs rqs can be reduced when
> sched-idle or idle rqs exist.
> 
> A CFS runqueue is considered overloaded when there is
> more than one pullable non-idle task on it (since sched-
> idle cpus are treated as idle cpus). Idle tasks are
> counted towards rq->cfs.idle_h_nr_running; a task is
> idle if it is either assigned the SCHED_IDLE policy or
> placed under an idle cgroup.
> 
> Overloaded cfs rqs can cause performance issues for
> both task types:
> 
>    - for latency-critical tasks like SCHED_NORMAL,
>      time spent waiting in the rq increases, resulting
>      in higher pct99 latency, and
> 
>    - batch tasks may not be able to make full use
>      of cpu capacity if sched-idle rqs exist, thus
>      showing poorer throughput.
> 
> So in short, the goal of sched-idle balancing is to
> let the *non-idle tasks* make full use of cpu resources.
> To achieve that, we mainly do two things:
> 
>    - pull non-idle tasks for sched-idle or idle rqs
>      from the overloaded ones, and
> 
>    - prevent pulling the last non-idle task from an rq
> 
> The mask of overloaded cpus is updated in the periodic
> tick and in the idle path, on a per-LLC-domain basis.
> This cpumask is also used in SIS as a filter, improving
> idle cpu searching.
> 
> Tests were done on an Intel Xeon E5-2650 v4 server with
> 2 NUMA nodes, each of which has 12 cores; with SMT2
> enabled, that is 48 CPUs in total. Test results are
> listed as follows.
> 
>    - we used the perf messaging benchmark to measure
>      throughput at different loads (groups).
> 
>        perf bench sched messaging -g [N] -l 40000
> 
> 	N	w/o	w/	diff
> 	1	2.897	2.834	-2.17%
> 	3	5.156	4.904	-4.89%
> 	5	7.850	7.617	-2.97%
> 	10	15.140	14.574	-3.74%
> 	20	29.387	27.602	-6.07%
> 
>      the results show an approximate 2~6% improvement.
> 
>    - and schbench to test latency performance in two
>      scenarios: quiet and noisy. In the quiet test, we
>      run schbench in a normal cpu cgroup on an otherwise
>      quiet system, while the noisy test additionally
>      runs the perf messaging workload inside an idle
>      cgroup as noise.
> 
>        schbench -m 2 -t 24 -i 60 -r 60
>        perf bench sched messaging -g 1 -l 4000000
> 
> 	[quiet]
> 			w/o	w/
> 	50.0th		31	31
> 	75.0th		45	45
> 	90.0th		55	55
> 	95.0th		62	61
> 	*99.0th		85	86
> 	99.5th		565	318
> 	99.9th		11536	10992
> 	max		13029	13067
> 
> 	[noisy]
> 			w/o	w/
> 	50.0th		34	32
> 	75.0th		48	45
> 	90.0th		58	55
> 	95.0th		65	61
> 	*99.0th		2364	208
> 	99.5th		6696	2068
> 	99.9th		12688	8816
> 	max		15209	14191
> 
>      it can be seen that the quiet test results are
>      quite similar, but the p99 latency is greatly
>      improved in the noisy test.
> 
> Comments and tests are appreciated!
> 
> Abel Wu (5):
>    sched/fair: record overloaded cpus
>    sched/fair: introduce sched-idle balance
>    sched/fair: add stats for sched-idle balancing
>    sched/fair: filter out overloaded cpus in sis
>    sched/fair: favor cpu capacity for idle tasks
> 
>   include/linux/sched/idle.h     |   1 +
>   include/linux/sched/topology.h |  15 ++++
>   kernel/sched/core.c            |   1 +
>   kernel/sched/fair.c            | 187 ++++++++++++++++++++++++++++++++++++++++-
>   kernel/sched/sched.h           |   6 ++
>   kernel/sched/stats.c           |   5 +-
>   kernel/sched/topology.c        |   4 +-
>   7 files changed, 215 insertions(+), 4 deletions(-)
> 
