Message-Id: <9FD4A7E9-B545-40AB-A5B5-66DF37991474@linux.alibaba.com>
Date: Tue, 26 Jan 2021 18:18:59 +0800
From: changhuaixin <changhuaixin@...ux.alibaba.com>
To: changhuaixin <changhuaixin@...ux.alibaba.com>
Cc: bsegall@...gle.com, dietmar.eggemann@....com,
juri.lelli@...hat.com, khlebnikov@...dex-team.ru,
linux-kernel@...r.kernel.org, mgorman@...e.de, mingo@...hat.com,
pauld@...head.com, peterz@...radead.org, pjt@...gle.com,
rostedt@...dmis.org, shanpeic@...ux.alibaba.com,
vincent.guittot@...aro.org, xiyou.wangcong@...il.com
Subject: Re: [PATCH v3 0/4] sched/fair: Burstable CFS bandwidth controller
> On Jan 21, 2021, at 7:04 PM, Huaixin Chang <changhuaixin@...ux.alibaba.com> wrote:
>
> Changelog
>
> v3:
> 1. Fix another issue reported by test robot.
> 2. Update docs as Randy Dunlap suggested.
>
> v2:
> 1. Fix an issue reported by test robot.
> 2. Rewrote docs. Further suggestions or help are appreciated.
>
> The CFS bandwidth controller limits the CPU time a task group may
> consume to quota during each period. However, parallel workloads are
> often bursty, so they get throttled, and since they are latency
> sensitive at the same time, throttling them is undesirable.
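>
> For example, with a 100ms period and a 50ms quota, a group that needs
> 80ms of CPU time within one period is throttled once it has consumed
> 50ms, even if it left most of its quota unused in the preceding
> periods.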
>
> Scaling up the period and quota allows greater burst capacity, but a
> throttled group may then stay stuck longer until the next refill. We
> introduce "burst" to allow accumulating unused quota from previous
> periods and spending it when a task group requests more CPU than quota
> during a specific period. This permits CPU time requests above quota as
> long as the average requested CPU time stays below quota in the long
> run. The maximum accumulation is capped by burst and is set to 0 by
> default, so the traditional behaviour remains.
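>
> A minimal sketch of the refill behaviour described above (a simplified
> model with hypothetical names, not the actual patch code):
>
> #include <stdint.h>
>
> /*
>  * At each period boundary the group's runtime is refilled by quota.
>  * Unused runtime carries over, but the total is capped at
>  * quota + burst, so the carry-over never exceeds burst.
>  */
> static uint64_t refill_runtime(uint64_t runtime, uint64_t quota,
>                                uint64_t burst)
> {
>         uint64_t max_runtime = quota + burst;
>
>         runtime += quota;
>         if (runtime > max_runtime)
>                 runtime = max_runtime;
>         return runtime;
> }
>
> With burst == 0, each refill resets runtime to exactly quota, which is
> the traditional behaviour.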
>
> A huge drop in 99th percentile tail latency, from more than 500ms to
> 27ms, is seen for real Java workloads when using burst. Similar drops
> are seen when testing with schbench too:
>
> echo $$ > /sys/fs/cgroup/cpu/test/cgroup.procs
> echo 700000 > /sys/fs/cgroup/cpu/test/cpu.cfs_quota_us
> echo 100000 > /sys/fs/cgroup/cpu/test/cpu.cfs_period_us
> echo 400000 > /sys/fs/cgroup/cpu/test/cpu.cfs_burst_us
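> # i.e. 700ms of CPU time per 100ms period (7 CPUs on average), with
> # up to 400ms of unused quota allowed to accumulate as burst.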
>
> # The average CPU usage is around 500%, which is 200ms CPU time
> # every 40ms.
> ./schbench -m 1 -t 30 -r 60 -c 10000 -R 500
>
> Without burst:
>
> Latency percentiles (usec)
> 50.0000th: 7
> 75.0000th: 8
> 90.0000th: 9
> 95.0000th: 10
> *99.0000th: 933
> 99.5000th: 981
> 99.9000th: 3068
> min=0, max=20054
> rps: 498.31 p95 (usec) 10 p99 (usec) 933 p95/cputime 0.10% p99/cputime 9.33%
>
> With burst:
>
> Latency percentiles (usec)
> 50.0000th: 7
> 75.0000th: 8
> 90.0000th: 9
> 95.0000th: 9
> *99.0000th: 12
> 99.5000th: 13
> 99.9000th: 19
> min=0, max=406
> rps: 498.36 p95 (usec) 9 p99 (usec) 12 p95/cputime 0.09% p99/cputime 0.12%
>
> How much a workload benefits from burstable CFS bandwidth control
> depends on how bursty and how latency sensitive it is.
>
> Previously, Cong Wang and Konstantin Khlebnikov proposed a similar
> feature:
> https://lore.kernel.org/lkml/20180522062017.5193-1-xiyou.wangcong@gmail.com/
> https://lore.kernel.org/lkml/157476581065.5793.4518979877345136813.stgit@buzz/
>
> This time we present more latency statistics and handle overflow while
> accumulating unused quota.
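>
> As an illustration, one overflow-safe way to do the capped
> accumulation (a hedged sketch with a hypothetical helper, not
> necessarily what the patch does):
>
> #include <stdint.h>
>
> /* Add b to a, clamping at cap and guarding against u64 wraparound. */
> static uint64_t add_capped(uint64_t a, uint64_t b, uint64_t cap)
> {
>         if (a > UINT64_MAX - b) /* a + b would overflow */
>                 return cap;
>         a += b;
>         return a > cap ? cap : a;
> }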
>
> Huaixin Chang (4):
> sched/fair: Introduce primitives for CFS bandwidth burst
> sched/fair: Make CFS bandwidth controller burstable
> sched/fair: Add cfs bandwidth burst statistics
> sched/fair: Add document for burstable CFS bandwidth control
>
> Documentation/scheduler/sched-bwc.rst | 49 +++++++++++--
> include/linux/sched/sysctl.h | 2 +
> kernel/sched/core.c | 126 +++++++++++++++++++++++++++++-----
> kernel/sched/fair.c | 58 +++++++++++++---
> kernel/sched/sched.h | 9 ++-
> kernel/sysctl.c | 18 +++++
> 6 files changed, 232 insertions(+), 30 deletions(-)
>
> --
> 2.14.4.44.g2045bb6
Ping. Are there any new comments on this patchset? If there are no other concerns, I think it is ready to be merged.