linux-kernel - [PATCH v2 0/4] sched/eevdf: Optimize reweight and pick

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20231115033647.80785-1-wuyun.abel@bytedance.com>
Date:   Wed, 15 Nov 2023 11:36:43 +0800
From:   Abel Wu <wuyun.abel@...edance.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Valentin Schneider <valentin.schneider@....com>
Cc:     Barry Song <21cnbao@...il.com>,
        Benjamin Segall <bsegall@...gle.com>,
        Chen Yu <yu.c.chen@...el.com>,
        Daniel Jordan <daniel.m.jordan@...cle.com>,
        "Gautham R . Shenoy" <gautham.shenoy@....com>,
        Joel Fernandes <joel@...lfernandes.org>,
        K Prateek Nayak <kprateek.nayak@....com>,
        Mike Galbraith <efault@....de>,
        Qais Yousef <qyousef@...alina.io>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Yicong Yang <yangyicong@...wei.com>,
        Youssef Esmat <youssefesmat@...omium.org>,
        linux-kernel@...r.kernel.org, Abel Wu <wuyun.abel@...edance.com>
Subject: [PATCH v2 0/4] sched/eevdf: Optimize reweight and pick

v1 -> v2:
  - Removed unrelated hunks in 2nd patch.

---------

This patchset makes these contributions:

  [1/4] Fixes the problem that vruntime doesn't get adjusted
        when reweight at !0-lag point.

  [2/4] Optimize out the fallback search on @best_left which
        doubles the cost in worst case.

  [3/4] Enable O(1) fastpath picking based on deadline-sorted
        leftmost-cached rbtree.

  [4/4] Statistics for patch 3, not intended to upstream.

All the benchmarks are done inside a normal cpu cgroup in a clean
environment with cpu turbo disabled, on a Dual-CPU Intel Xeon(R)
Platinum 8260 with 2 NUMA nodes each of which has 24C/48T.

	p0: baseline, tip/master 1187c0b3a6c2
	p1: p0 + patch(1)
	p3: p0 + patch(1~3)

hackbench
=========
case            load    	  p0% (std%)	   p1% ( std%)	   p3% ( std%)
process-pipe    group-1 	 1.00 (  2.49)	 -1.73 (  2.64)	 -3.77 (  0.91)
process-pipe    group-2 	 1.00 (  5.23)	 +5.51 (  2.32)	 -3.41 (  4.28)
process-pipe    group-4 	 1.00 (  5.30)	 +3.53 (  5.46)	 +6.51 (  1.44)
process-pipe    group-8 	 1.00 (  1.36)	 -1.85 (  2.22)	 -3.57 (  1.06)
process-sockets group-1 	 1.00 (  2.29)	 -2.39 (  2.66)	 -2.39 (  1.86)
process-sockets group-2 	 1.00 (  3.46)	 +0.46 (  1.85)	 +1.19 (  2.08)
process-sockets group-4 	 1.00 (  1.43)	 -1.98 (  2.78)	 +4.52 (  8.68)
process-sockets group-8 	 1.00 (  0.95)	 -1.60 (  0.94)	 +2.78 (  2.14)
threads-pipe    group-1 	 1.00 (  1.92)	 +5.33 (  1.54)	 +3.47 (  1.09)
threads-pipe    group-2 	 1.00 (  0.64)	 +0.51 (  2.31)	 +2.91 (  0.43)
threads-pipe    group-4 	 1.00 (  3.03)	 -2.91 (  2.31)	 +1.83 (  1.65)
threads-pipe    group-8 	 1.00 (  2.55)	 +1.89 (  3.04)	 -1.29 (  2.32)
threads-sockets group-1 	 1.00 (  0.71)	 +0.83 (  0.52)	 -0.42 (  0.52)
threads-sockets group-2 	 1.00 (  2.48)	 -2.52 (  1.20)	 -3.27 (  0.59)
threads-sockets group-4 	 1.00 (  1.96)	 +2.67 (  2.34)	 +3.74 (  1.18)
threads-sockets group-8 	 1.00 (  1.09)	 -2.30 (  0.51)	 +3.07 (  0.62)

netperf
=======
case            load    	  p0% (std%)	   p1% ( std%)	   p3% ( std%)
TCP_RR          thread-24	 1.00 (  2.48)	 -2.15 (  2.38)	 +0.17 (  1.95)
TCP_RR          thread-48	 1.00 (  0.73)	 -1.59 (  0.51)	 +0.47 (  0.93)
TCP_RR          thread-72	 1.00 (  1.04)	 -1.26 (  1.03)	 -0.09 (  1.13)
TCP_RR          thread-96	 1.00 ( 29.36)	+70.41 ( 14.86)	+17.88 ( 37.24)
TCP_RR          thread-192	 1.00 ( 28.29)	 -1.30 ( 34.03)	 -2.00 ( 30.63)
TCP_STREAM      thread-24	 1.00 (  1.57)	 +0.38 (  1.90)	 +0.20 (  1.72)
TCP_STREAM      thread-48	 1.00 (  0.08)	 -0.29 (  0.07)	 +0.15 (  0.12)
TCP_STREAM      thread-72	 1.00 (  0.01)	 -0.00 (  0.00)	 +0.00 (  0.00)
TCP_STREAM      thread-96	 1.00 (  0.76)	 +0.16 (  0.65)	 +0.30 (  0.47)
TCP_STREAM      thread-192	 1.00 (  0.65)	 +0.23 (  0.46)	 +0.25 (  0.49)
UDP_RR          thread-24	 1.00 (  1.74)	 -1.26 (  2.41)	 +0.81 (  3.02)
UDP_RR          thread-48	 1.00 (  0.56)	 -0.40 ( 16.72)	 -0.98 (  0.36)
UDP_RR          thread-72	 1.00 (  0.84)	 -0.70 (  0.66)	 -0.27 (  0.88)
UDP_RR          thread-96	 1.00 (  1.24)	 -0.44 (  1.01)	 -0.99 (  8.99)
UDP_RR          thread-192	 1.00 ( 28.02)	 -0.42 ( 31.59)	 -1.80 ( 26.23)
UDP_STREAM      thread-24	 1.00 (100.05)	 +0.31 (100.04)	 +0.32 (100.06)
UDP_STREAM      thread-48	 1.00 (104.35)	 +1.22 (105.14)	 +1.65 (104.10)
UDP_STREAM      thread-72	 1.00 (100.69)	 +1.28 (100.63)	 -0.17 (100.49)
UDP_STREAM      thread-96	 1.00 ( 99.63)	 +0.33 ( 99.51)	 -0.25 ( 99.53)
UDP_STREAM      thread-192	 1.00 (100.57)	 +2.00 (107.01)	 -1.21 ( 99.51)

tbench
======
case            load    	  p0% (std%)	   p1% ( std%)	   p3% ( std%)
loopback        thread-24	 1.00 (  0.49)	 -1.47 (  0.94)	 +0.08 (  0.75)
loopback        thread-48	 1.00 (  0.42)	 -0.04 (  0.53)	 -0.06 (  0.34)
loopback        thread-72	 1.00 (  7.10)	 -3.33 (  2.98)	 -5.06 (  0.31)
loopback        thread-96	 1.00 (  0.80)	 +2.65 (  0.80)	 -0.68 (  1.30)
loopback        thread-192	 1.00 (  1.24)	 +1.21 (  0.73)	 -1.78 (  0.22)

schbench
========
case            load    	  p0% (std%)	   p1% ( std%)	   p3% ( std%)
normal          mthread-1	 1.00 (  5.83)	 -2.83 (  1.46)	 +1.24 (  2.51)
normal          mthread-2	 1.00 (  4.45)	 +8.94 (  7.81)	+14.24 (  7.44)
normal          mthread-4	 1.00 (  2.73)	 +2.53 (  4.31)	+12.44 (  5.99)
normal          mthread-8	 1.00 (  0.15)	 +0.21 (  0.13)	 -0.34 (  0.11)

Seems no obvious complain from these benchmarks.
Comments are appreciated! Thanks!

Abel Wu (4):
  sched/eevdf: Fix vruntime adjustment on reweight
  sched/eevdf: Sort the rbtree by virtual deadline
  sched/eevdf: O(1) fastpath for task selection
  sched/stats: Add statistics for pick_eevdf()

 include/linux/sched.h |   2 +-
 kernel/sched/debug.c  |  11 +-
 kernel/sched/fair.c   | 341 ++++++++++++++++++++++++++----------------
 kernel/sched/sched.h  |   6 +
 kernel/sched/stats.c  |   6 +-
 5 files changed, 233 insertions(+), 133 deletions(-)

-- 
2.37.3