Message-ID: <5effb4de-b3d1-cffe-938e-4bdd1cc64b44@efficios.com>
Date: Fri, 25 Aug 2023 10:03:54 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Swapnil Sapkal <Swapnil.Sapkal@....com>,
Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
Valentin Schneider <vschneid@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Juri Lelli <juri.lelli@...hat.com>,
Aaron Lu <aaron.lu@...el.com>,
Julien Desfossez <jdesfossez@...italocean.com>, x86@...nel.org
Subject: Re: [RFC PATCH v3 0/3] sched: Skip queued wakeups only when L2 is
shared
On 8/25/23 06:11, Swapnil Sapkal wrote:
> Hello Mathieu,
>
> On 8/22/2023 5:01 PM, Mathieu Desnoyers wrote:
>> This series improves performance of scheduler wakeups on large systems
>> by skipping queued wakeups only when CPUs share their L2 cache, rather
>> than when they share their LLC.
>>
>> The speedup mainly reproduces on workloads which have at least *some*
>> idle time (because it significantly increases the number of migrations,
>> and thus remote wakeups), *and* it needs to have a sufficient load to
>> cause contention on the runqueue locks.
>>
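The policy change described above can be illustrated with a small toy model. This is only a sketch of the cache-sharing test, not the kernel code: the Cpu class, its l2_id/llc_id fields and the skip_queued_wakeup() helper are hypothetical names made up for the illustration, and the other conditions the scheduler checks before queueing a wakeup are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    """Hypothetical CPU descriptor: which L2 and which LLC the CPU belongs to."""
    cpu_id: int
    l2_id: int
    llc_id: int


def skip_queued_wakeup(waker: Cpu, target: Cpu, domain: str) -> bool:
    """Return True when the queued (remote) wakeup path would be skipped.

    domain == "llc" models the behaviour before the series: skip when the
    two CPUs share their last-level cache.
    domain == "l2" models the behaviour proposed by the series: skip only
    when the two CPUs share their L2 cache.
    """
    if domain == "llc":
        return waker.llc_id == target.llc_id
    if domain == "l2":
        return waker.l2_id == target.l2_id
    raise ValueError(f"unknown cache domain: {domain!r}")


# Assumed topology for the example: cpu0 and cpu1 share an L2,
# cpu2 sits in the same LLC but has a different L2.
cpu0 = Cpu(cpu_id=0, l2_id=0, llc_id=0)
cpu1 = Cpu(cpu_id=1, l2_id=0, llc_id=0)
cpu2 = Cpu(cpu_id=2, l2_id=1, llc_id=0)

for target in (cpu1, cpu2):
    print(
        f"cpu0 -> cpu{target.cpu_id}:",
        "llc policy skips queue:", skip_queued_wakeup(cpu0, target, "llc"),
        "| l2 policy skips queue:", skip_queued_wakeup(cpu0, target, "l2"),
    )
```

In this model the only pair whose treatment changes is cpu0 -> cpu2: both CPUs share the LLC but not the L2, so the wakeup goes through the queued path under the L2-based test.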
>> Feedback is welcome,
>
> I ran some micro-benchmarks as part of testing this series. Here are the
> observations:
>
> - Hackbench shows improvement with this patch and Aaron's patch with
> 6.5-rc1 kernel as the baseline.
>
> - tbench and netperf show some dip in performance in the highly
> overloaded case.
>
> - Other micro-benchmarks show more or less similar performance with
> these patches.
Those results look promising! Thanks for testing!
Mathieu
>
> o System Details
>
> - 4th Generation EPYC System
> - 2 x 128C/256T
> - NPS1 mode
>
> o Kernels
>
> base: 6.5.0-rc1
> base + mathieu-queued-wakeup: 6.5.0-rc1 + Mathieu's patches [1]
> base + aaron-tg-load-avg: 6.5.0-rc1 + Aaron's patch [2]
> base + queued-wakeup + tg-load-avg: 6.5.0-rc1 + Mathieu's patches [1] + Aaron's patch [2]
>
> [References]
>
> [1] "sched: Skip queued wakeups only when L2 is shared"
>     (https://lore.kernel.org/all/20230822113133.643238-1-mathieu.desnoyers@efficios.com/)
> [2] "Reduce cost of accessing tg->load_avg"
>     (https://lore.kernel.org/lkml/20230823060832.454842-1-aaron.lu@intel.com/)
>
> ==================================================================
> Test : hackbench
> Units : Time in seconds
> Interpretation: Lower is better
> Statistic : AMean
> ==================================================================
> Test:         6.5.0-rc1 (base)      base + mathieu-queued-wakeup  base + aaron-tg-load-avg  base + queued-wakeup + tg-load-avg
> 1-groups:     22.15 (0.00 pct)      22.46 (-1.39 pct)             22.35 (-0.90 pct)         21.20 (4.28 pct)
> 2-groups:     22.76 (0.00 pct)      21.78 (4.30 pct)              22.60 (0.70 pct)          21.90 (3.77 pct)
> 4-groups:     22.12 (0.00 pct)      22.02 (0.45 pct)              22.22 (-0.45 pct)         21.94 (0.81 pct)
> 8-groups:     24.80 (0.00 pct)      22.36 (9.83 pct)              22.99 (7.29 pct)          22.00 (11.29 pct)
> 16-groups:    31.09 (0.00 pct)      21.56 (30.65 pct)             22.13 (28.81 pct)         20.60 (33.74 pct)
>
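For anyone cross-checking the "pct" columns: in the hackbench and tbench tables they are consistent with a signed change relative to the 6.5.0-rc1 baseline, with the sign arranged so that a positive value means an improvement, and with the result apparently truncated (not rounded) to two decimals. The helper below is a minimal sketch under those assumptions; pct_change() is a hypothetical name, not part of any benchmark tooling used in this thread.

```python
import math


def pct_change(base: float, value: float, lower_is_better: bool) -> float:
    """Signed percent improvement of `value` over `base`.

    Positive means better than the baseline. The result is truncated toward
    zero at two decimals, which is an assumption inferred from the reported
    numbers (e.g. 9.8387... is shown as 9.83).
    """
    delta = (base - value) if lower_is_better else (value - base)
    pct = delta / base * 100.0
    return math.trunc(pct * 100.0) / 100.0


# hackbench, 16-groups (time in seconds, lower is better)
print(pct_change(31.09, 21.56, lower_is_better=True))

# tbench, 128 clients (throughput, higher is better)
print(pct_change(27275.73, 25176.80, lower_is_better=False))
```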
> ==================================================================
> Test : tbench
> Units : Throughput
> Interpretation: Higher is better
> Statistic : AMean
> ==================================================================
> Clients:      6.5.0-rc1 (base)      base + mathieu-queued-wakeup  base + aaron-tg-load-avg  base + queued-wakeup + tg-load-avg
> 1             261.49 (0.00 pct)     261.18 (-0.11 pct)            262.29 (0.30 pct)         257.80 (-1.41 pct)
> 2             514.08 (0.00 pct)     521.30 (1.40 pct)             517.66 (0.69 pct)         510.96 (-0.60 pct)
> 4             1002.51 (0.00 pct)    988.81 (-1.36 pct)            995.04 (-0.74 pct)        987.74 (-1.47 pct)
> 8             1978.74 (0.00 pct)    1966.60 (-0.61 pct)           1991.85 (0.66 pct)        1941.39 (-1.88 pct)
> 16            3864.14 (0.00 pct)    3952.03 (2.27 pct)            3914.80 (1.31 pct)        3873.88 (0.25 pct)
> 32            7473.19 (0.00 pct)    7602.38 (1.72 pct)            7585.94 (1.50 pct)        7423.44 (-0.66 pct)
> 64            14335.10 (0.00 pct)   14313.17 (-0.15 pct)          14474.67 (0.97 pct)       14030.63 (-2.12 pct)
> 128           27275.73 (0.00 pct)   25176.80 (-7.69 pct)          28066.53 (2.89 pct)       25045.53 (-8.17 pct)
> 256           41688.17 (0.00 pct)   44373.40 (6.44 pct)           43779.37 (5.01 pct)       41427.00 (-0.62 pct)
> 512           137481.33 (0.00 pct)  136466.67 (-0.73 pct)         134824.00 (-1.93 pct)     141280.00 (2.76 pct)
> 1024          140534.00 (0.00 pct)  141916.33 (0.98 pct)          137008.33 (-2.50 pct)     126319.33 (-10.11 pct)
> 2048          145378.00 (0.00 pct)  145479.33 (0.06 pct)          138763.67 (-4.54 pct)     124471.00 (-14.38 pct)
>
> ==================================================================
> Test : netperf
> Units : Throughput
> Interpretation: Higher is better
> Statistic : AMean
> ==================================================================
>               6.5.0-rc1 (base)      base + mathieu-queued-wakeup  base + aaron-tg-load-avg  base + queued-wakeup + tg-load-avg
> 1-clients:    59642.88 (0.00 pct)   61647.37 (3.36 pct)           61186.24 (2.58 pct)       59099.11 (-0.91 pct)
> 2-clients:    59349.65 (0.00 pct)   60896.01 (2.60 pct)           60582.49 (2.07 pct)       62738.47 (5.70 pct)
> 4-clients:    59197.37 (0.00 pct)   60457.29 (2.12 pct)           63042.52 (6.49 pct)       60879.58 (2.84 pct)
> 8-clients:    61977.66 (0.00 pct)   60389.92 (-2.56 pct)          62078.15 (0.16 pct)       60314.65 (-2.68 pct)
> 16-clients:   61518.83 (0.00 pct)   61143.51 (-0.61 pct)          60946.08 (-0.93 pct)      59388.78 (-3.46 pct)
> 32-clients:   58230.81 (0.00 pct)   58653.20 (0.72 pct)           58594.14 (0.62 pct)       58188.52 (-0.07 pct)
> 64-clients:   58050.92 (0.00 pct)   57834.55 (-0.37 pct)          58183.51 (0.22 pct)       57565.75 (-0.83 pct)
> 128-clients:  54324.55 (0.00 pct)   54385.60 (0.11 pct)           54913.43 (1.08 pct)       53917.11 (-0.75 pct)
> 256-clients:  70155.29 (0.00 pct)   69390.68 (-1.08 pct)          70097.50 (-0.08 pct)      64410.66 (-8.18 pct)
> 512-clients:  61511.77 (0.00 pct)   61480.99 (-0.05 pct)          54493.82 (-11.40 pct)     46227.05 (-24.84 pct)
>
> ==================================================================
> Test : stream-10
> Units : Bandwidth, MB/s
> Interpretation: Higher is better
> Statistic : HMean
> ==================================================================
> Test:         6.5.0-rc1 (base)      base + mathieu-queued-wakeup  base + aaron-tg-load-avg  base + queued-wakeup + tg-load-avg
> Copy:         353336.76 (0.00 pct)  352956.36 (-0.10 pct)         349583.67 (-1.06 pct)     351152.80 (-0.61 pct)
> Scale:        353474.88 (0.00 pct)  354582.35 (0.31 pct)          350543.75 (-0.82 pct)     353275.74 (-0.05 pct)
> Add:          371984.24 (0.00 pct)  372824.87 (0.22 pct)          369173.72 (-0.75 pct)     370483.63 (-0.40 pct)
> Triad:        372625.41 (0.00 pct)  278389.62 (-25.28 pct)        369504.06 (-0.83 pct)     369070.11 (-0.95 pct)
>
> ==================================================================
> Test : stream-100
> Units : Bandwidth, MB/s
> Interpretation: Higher is better
> Statistic : HMean
> ==================================================================
> Test:         6.5.0-rc1 (base)      base + mathieu-queued-wakeup  base + aaron-tg-load-avg  base + queued-wakeup + tg-load-avg
> Copy:         353476.35 (0.00 pct)  354954.50 (0.41 pct)          354614.56 (0.32 pct)      353512.71 (0.01 pct)
> Scale:        353214.73 (0.00 pct)  354884.12 (0.47 pct)          355841.17 (0.74 pct)      353220.53 (0.00 pct)
> Add:          370755.48 (0.00 pct)  372292.72 (0.41 pct)          375307.35 (1.22 pct)      369917.77 (-0.22 pct)
> Triad:        370652.02 (0.00 pct)  372732.11 (0.56 pct)          375718.85 (1.36 pct)      369926.26 (-0.19 pct)
>
> ==================================================================
> Test : schbench (old)
> Units : 99th percentile latency in us
> Interpretation: Lower is better
> Statistic : Median
> ==================================================================
> #workers:     6.5.0-rc1 (base)      base + mathieu-queued-wakeup  base + aaron-tg-load-avg  base + queued-wakeup + tg-load-avg
> 1:            56.00 (0.00 pct)      58.00 (-3.57 pct)             60.00 (-7.14 pct)         60.00 (-7.14 pct)
> 2:            61.00 (0.00 pct)      56.00 (8.19 pct)              59.00 (3.27 pct)          60.00 (1.63 pct)
> 4:            64.00 (0.00 pct)      62.00 (3.12 pct)              66.00 (-3.12 pct)         64.00 (0.00 pct)
> 8:            96.00 (0.00 pct)      78.00 (18.75 pct)             76.00 (20.83 pct)         93.00 (3.12 pct)
> 16:           98.00 (0.00 pct)      95.00 (3.06 pct)              98.00 (0.00 pct)          95.00 (3.06 pct)
> 32:           137.00 (0.00 pct)     144.00 (-5.10 pct)            133.00 (2.91 pct)         130.00 (5.10 pct)
> 64:           206.00 (0.00 pct)     210.00 (-1.94 pct)            200.00 (2.91 pct)         217.00 (-5.33 pct)
> 128:          348.00 (0.00 pct)     347.00 (0.28 pct)             413.00 (-18.67 pct)       366.00 (-5.17 pct)
> 256:          679.00 (0.00 pct)     669.00 (1.47 pct)             669.00 (1.47 pct)         675.00 (0.58 pct)
> 512:          1366.00 (0.00 pct)    1366.00 (0.00 pct)            1442.00 (-5.56 pct)       1430.00 (-4.68 pct)
>
>
> ==================================================================
> Test : schbench (new)
> Units : 99th percentile latency in us
> Interpretation: Lower is better
> Statistic : Median
> ==================================================================
> Metric: wakeup_lat_summary
> #workers:     6.5.0-rc1 (base)      base + mathieu-queued-wakeup  base + aaron-tg-load-avg  base + queued-wakeup + tg-load-avg
> 1:            15.00 (0.00 pct)      15.00 (0.00 pct)              16.00 (-6.66 pct)         17.00 (-13.33 pct)
> 2:            16.00 (0.00 pct)      16.00 (0.00 pct)              17.00 (-6.25 pct)         17.00 (-6.25 pct)
> 4:            17.00 (0.00 pct)      17.00 (0.00 pct)              15.00 (11.76 pct)         17.00 (0.00 pct)
> 8:            11.00 (0.00 pct)      13.00 (-18.18 pct)            11.00 (0.00 pct)          11.00 (0.00 pct)
> 16:           11.00 (0.00 pct)      11.00 (0.00 pct)              10.00 (9.09 pct)          9.00 (18.18 pct)
> 32:           11.00 (0.00 pct)      11.00 (0.00 pct)              11.00 (0.00 pct)          11.00 (0.00 pct)
> 64:           10.00 (0.00 pct)      11.00 (-10.00 pct)            10.00 (0.00 pct)          10.00 (0.00 pct)
> 128:          11.00 (0.00 pct)      12.00 (-9.09 pct)             12.00 (-9.09 pct)         11.00 (0.00 pct)
> 256:          117.00 (0.00 pct)     162.00 (-38.46 pct)           90.00 (23.07 pct)         103.00 (11.96 pct)
> 512:          22496.00 (0.00 pct)   21664.00 (3.69 pct)           22368.00 (0.56 pct)       21408.00 (4.83 pct)
>
> Metric: request_lat_summary
> #workers:     6.5.0-rc1 (base)      base + mathieu-queued-wakeup  base + aaron-tg-load-avg  base + queued-wakeup + tg-load-avg
> 1:            6872.00 (0.00 pct)    6872.00 (0.00 pct)            6792.00 (1.16 pct)        6856.00 (0.23 pct)
> 2:            6824.00 (0.00 pct)    6824.00 (0.00 pct)            6872.00 (-0.70 pct)       6856.00 (-0.46 pct)
> 4:            6824.00 (0.00 pct)    6808.00 (0.23 pct)            6872.00 (-0.70 pct)       6824.00 (0.00 pct)
> 8:            6824.00 (0.00 pct)    6824.00 (0.00 pct)            6872.00 (-0.70 pct)       6824.00 (0.00 pct)
> 16:           6824.00 (0.00 pct)    6840.00 (-0.23 pct)           6872.00 (-0.70 pct)       6840.00 (-0.23 pct)
> 32:           6840.00 (0.00 pct)    6840.00 (0.00 pct)            6888.00 (-0.70 pct)       6856.00 (-0.23 pct)
> 64:           6840.00 (0.00 pct)    6872.00 (-0.46 pct)           6888.00 (-0.70 pct)       6872.00 (-0.46 pct)
> 128:          12272.00 (0.00 pct)   12784.00 (-4.17 pct)          13200.00 (-7.56 pct)      12016.00 (2.08 pct)
> 256:          13328.00 (0.00 pct)   13392.00 (-0.48 pct)          13712.00 (-2.88 pct)      13552.00 (-1.68 pct)
> 512:          88832.00 (0.00 pct)   86400.00 (2.73 pct)           88192.00 (0.72 pct)       85632.00 (3.60 pct)
>
> Metric: rps_summary
> #workers:     6.5.0-rc1 (base)      base + mathieu-queued-wakeup  base + aaron-tg-load-avg  base + queued-wakeup + tg-load-avg
> 1:            297.00 (0.00 pct)     297.00 (0.00 pct)             297.00 (0.00 pct)         299.00 (-0.67 pct)
> 2:            601.00 (0.00 pct)     603.00 (-0.33 pct)            595.00 (0.99 pct)         601.00 (0.00 pct)
> 4:            1206.00 (0.00 pct)    1206.00 (0.00 pct)            1190.00 (1.32 pct)        1206.00 (0.00 pct)
> 8:            2412.00 (0.00 pct)    2412.00 (0.00 pct)            2396.00 (0.66 pct)        2420.00 (-0.33 pct)
> 16:           4840.00 (0.00 pct)    4824.00 (0.33 pct)            4792.00 (0.99 pct)        4840.00 (0.00 pct)
> 32:           9648.00 (0.00 pct)    9648.00 (0.00 pct)            9584.00 (0.66 pct)        9680.00 (-0.33 pct)
> 64:           19360.00 (0.00 pct)   19296.00 (0.33 pct)           19168.00 (0.99 pct)       19296.00 (0.33 pct)
> 128:          37952.00 (0.00 pct)   35264.00 (7.08 pct)           36672.00 (3.37 pct)       38080.00 (-0.33 pct)
> 256:          41408.00 (0.00 pct)   41536.00 (-0.30 pct)          39744.00 (4.01 pct)       40896.00 (1.23 pct)
> 512:          36288.00 (0.00 pct)   36800.00 (-1.41 pct)          35264.00 (2.82 pct)       35776.00 (1.41 pct)
>
> Tested-by: Swapnil Sapkal <Swapnil.Sapkal@....com>
>
>>
>> Thanks,
>>
>> Mathieu
>>
>> Mathieu Desnoyers (3):
>> sched: Rename cpus_share_cache to cpus_share_llc
>> sched: Introduce cpus_share_l2c (v3)
>> sched: ttwu_queue_cond: skip queued wakeups across different l2 caches
>>
>> Cc: Ingo Molnar <mingo@...hat.com>
>> Cc: Peter Zijlstra <peterz@...radead.org>
>> Cc: Valentin Schneider <vschneid@...hat.com>
>> Cc: Steven Rostedt <rostedt@...dmis.org>
>> Cc: Ben Segall <bsegall@...gle.com>
>> Cc: Mel Gorman <mgorman@...e.de>
>> Cc: Daniel Bristot de Oliveira <bristot@...hat.com>
>> Cc: Vincent Guittot <vincent.guittot@...aro.org>
>> Cc: Juri Lelli <juri.lelli@...hat.com>
>> Cc: Swapnil Sapkal <Swapnil.Sapkal@....com>
>> Cc: Aaron Lu <aaron.lu@...el.com>
>> Cc: Julien Desfossez <jdesfossez@...italocean.com>
>> Cc: x86@...nel.org
>>
>> block/blk-mq.c | 2 +-
>> include/linux/sched/topology.h | 10 ++++++++--
>> kernel/sched/core.c | 14 +++++++++++---
>> kernel/sched/fair.c | 8 ++++----
>> kernel/sched/sched.h | 2 ++
>> kernel/sched/topology.c | 32 +++++++++++++++++++++++++++++---
>> 6 files changed, 55 insertions(+), 13 deletions(-)
>>
> --
> Thanks and Regards,
> Swapnil
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com