Message-Id: <20230121042436.2661843-1-yury.norov@gmail.com>
Date: Fri, 20 Jan 2023 20:24:27 -0800
From: Yury Norov <yury.norov@...il.com>
To: linux-kernel@...r.kernel.org,
"David S. Miller" <davem@...emloft.net>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Barry Song <baohua@...nel.org>,
Ben Segall <bsegall@...gle.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Gal Pressman <gal@...dia.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Heiko Carstens <hca@...ux.ibm.com>,
Ingo Molnar <mingo@...hat.com>,
Jacob Keller <jacob.e.keller@...el.com>,
Jakub Kicinski <kuba@...nel.org>,
Jason Gunthorpe <jgg@...dia.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Jonathan Cameron <Jonathan.Cameron@...wei.com>,
Juri Lelli <juri.lelli@...hat.com>,
Leon Romanovsky <leonro@...dia.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Mel Gorman <mgorman@...e.de>,
Peter Lafreniere <peter@...jl.ca>,
Peter Zijlstra <peterz@...radead.org>,
Rasmus Villemoes <linux@...musvillemoes.dk>,
Saeed Mahameed <saeedm@...dia.com>,
Steven Rostedt <rostedt@...dmis.org>,
Tariq Toukan <tariqt@...dia.com>,
Tariq Toukan <ttoukan.linux@...il.com>,
Tony Luck <tony.luck@...el.com>,
Valentin Schneider <vschneid@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>
Cc: Yury Norov <yury.norov@...il.com>, linux-crypto@...r.kernel.org,
netdev@...r.kernel.org, linux-rdma@...r.kernel.org
Subject: [PATCH RESEND 0/9] sched: cpumask: improve on cpumask_local_spread() locality

cpumask_local_spread() currently checks the local node for the presence of
the i'th CPU, and if it finds nothing there, falls back to a flat search
among all non-local CPUs. We can do better by walking CPUs per NUMA hop,
in order of increasing distance from the local node.
This has significant performance implications on NUMA machines, for example
when using NUMA-aware allocated memory together with NUMA-aware IRQ
affinity hints.
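
For illustration, below is a minimal sketch of the usage pattern this
series enables, loosely modeled on the mlx5 patch in this series;
spread_work() and do_something() are hypothetical placeholders:

	#include <linux/cpumask.h>
	#include <linux/rcupdate.h>
	#include <linux/topology.h>

	/*
	 * Visit up to @n CPUs in order of increasing NUMA distance from
	 * @node. Each pass of the outer loop yields the cpumask of CPUs
	 * reachable within one more hop; for_each_cpu_andnot() skips the
	 * CPUs already visited on closer hops.
	 */
	static void spread_work(int node, unsigned int n)
	{
		const struct cpumask *mask, *prev = cpu_none_mask;
		unsigned int i = 0;
		int cpu;

		rcu_read_lock();
		for_each_numa_hop_mask(mask, node) {
			for_each_cpu_andnot(cpu, mask, prev) {
				do_something(cpu); /* hypothetical per-CPU work */
				if (++i == n)
					goto done;
			}
			prev = mask;
		}
	done:
		rcu_read_unlock();
	}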

Performance tests from patch 8 of this series for the Mellanox network
driver show:
TCP multi-stream, using 16 iperf3 instances pinned to 16 cores (with aRFS on).
Active cores: 64,65,72,73,80,81,88,89,96,97,104,105,112,113,120,121
+-------------------------+-----------+------------------+------------------+
| | BW (Gbps) | TX side CPU util | RX side CPU util |
+-------------------------+-----------+------------------+------------------+
| Baseline | 52.3 | 6.4 % | 17.9 % |
+-------------------------+-----------+------------------+------------------+
| Applied on TX side only | 52.6 | 5.2 % | 18.5 % |
+-------------------------+-----------+------------------+------------------+
| Applied on RX side only | 94.9 | 11.9 % | 27.2 % |
+-------------------------+-----------+------------------+------------------+
| Applied on both sides | 95.1 | 8.4 % | 27.3 % |
+-------------------------+-----------+------------------+------------------+

The RX-side bottleneck is released and line rate is reached (~1.8x
speedup), with ~30% less CPU utilization on the TX side.

This series was supposed to be included in v6.2, but that didn't happen.
It has spent enough time in -next without any issues, so I hope we'll
finally see it in v6.3.

I believe the best way would be to move it together with the scheduler
patches, but I'm OK with trying again through the bitmap branch as well.

Tariq Toukan (1):
  net/mlx5e: Improve remote NUMA preferences used for the IRQ affinity
    hints

Valentin Schneider (2):
  sched/topology: Introduce sched_numa_hop_mask()
  sched/topology: Introduce for_each_numa_hop_mask()

Yury Norov (6):
  lib/find: introduce find_nth_and_andnot_bit
  cpumask: introduce cpumask_nth_and_andnot
  sched: add sched_numa_find_nth_cpu()
  cpumask: improve on cpumask_local_spread() locality
  lib/cpumask: reorganize cpumask_local_spread() logic
  lib/cpumask: update comment for cpumask_local_spread()
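
For reference, the reworked cpumask_local_spread() ends up roughly like
the sketch below (a paraphrase of the logic after the cpumask patches,
not necessarily a verbatim copy of lib/cpumask.c):

	unsigned int cpumask_local_spread(unsigned int i, int node)
	{
		unsigned int cpu;

		/* Wrap: we always want a CPU. */
		i %= num_online_cpus();

		cpu = (node == NUMA_NO_NODE) ?
			cpumask_nth(i, cpu_online_mask) :  /* no NUMA hint: flat i'th CPU */
			sched_numa_find_nth_cpu(cpu_online_mask, i, node);

		WARN_ON(cpu >= nr_cpu_ids);
		return cpu;
	}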

 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 18 +++-
 include/linux/cpumask.h                      | 20 +++++
 include/linux/find.h                         | 33 +++++++
 include/linux/topology.h                     | 33 +++++++
 kernel/sched/topology.c                      | 90 ++++++++++++++++++++
 lib/cpumask.c                                | 52 ++++++-----
 lib/find_bit.c                               |  9 ++
 7 files changed, 230 insertions(+), 25 deletions(-)
--
2.34.1