lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240823185946.418340-1-mathieu.desnoyers@efficios.com>
Date: Fri, 23 Aug 2024 14:59:40 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>
Cc: linux-kernel@...r.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Valentin Schneider <vschneid@...hat.com>,
	Mel Gorman <mgorman@...e.de>,
	Steven Rostedt <rostedt@...dmis.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Ben Segall <bsegall@...gle.com>,
	Yury Norov <yury.norov@...il.com>,
	Rasmus Villemoes <linux@...musvillemoes.dk>,
	Shuah Khan <skhan@...uxfoundation.org>
Subject: [RFC PATCH v1 0/6] sched: NUMA-aware concurrency IDs

The issue addressed by this series is the non-locality of NUMA accesses
to data structures indexed by concurrency IDs: for example, in a
scenario where a process has two threads, and they periodically run one
after the other on different NUMA nodes, each will be assigned mm_cid=0.
As a consequence, they will end up accessing the same pages, and thus at
least one of the threads will need to perform remote NUMA accesses,
which is inefficient.

Solve this by making the rseq concurrency ID (mm_cid) NUMA-aware. On
NUMA systems, when a NUMA-aware concurrency ID is observed by user-space
to be associated with a NUMA node, guarantee that it never changes NUMA
node unless either a kernel-level NUMA configuration change happens, or
scheduler migrations end up migrating tasks across NUMA nodes.

There is a tradeoff between NUMA locality and compactness of the
concurrency ID allocation. Favor compactness over NUMA locality when
the scheduler migrates tasks across NUMA nodes, as this does not cause
the frequent remote NUMA accesses behavior. This is done by limiting the
concurrency ID range to minimum between the number of threads belonging
to the process and the number of allowed CPUs.

This series applies on top of v6.10.6.

Changes since v0: applied the changes requested by Yury Norov, and added
Reviewed-by tag from Shuah Khan for selftests changes. Rebased on
v6.10.6.

Cc: Valentin Schneider <vschneid@...hat.com>
Cc: Mel Gorman <mgorman@...e.de>
Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Ben Segall <bsegall@...gle.com>
Cc: Yury Norov <yury.norov@...il.com>
Cc: Rasmus Villemoes <linux@...musvillemoes.dk>
Cc: Shuah Khan <skhan@...uxfoundation.org>

Mathieu Desnoyers (6):
  lib: Clarify comment on top of find_next_andnot_bit
  lib: Implement find_{first,next,nth}_nor_bit, find_first_andnot_bit
  cpumask: Implement cpumask_{first,next}_{nor,andnot}
  sched: NUMA-aware per-memory-map concurrency IDs
  selftests/rseq: x86: Implement rseq_load_u32_u32
  selftests/rseq: Implement NUMA node id vs mm_cid invariant test

 include/linux/cpumask.h                       |  60 ++++++++
 include/linux/find.h                          | 119 ++++++++++++++-
 include/linux/mm_types.h                      |  57 ++++++-
 kernel/sched/core.c                           |  10 +-
 kernel/sched/sched.h                          | 139 +++++++++++++++--
 lib/find_bit.c                                |  36 +++++
 tools/testing/selftests/rseq/.gitignore       |   1 +
 tools/testing/selftests/rseq/Makefile         |   2 +-
 .../testing/selftests/rseq/basic_numa_test.c  | 144 ++++++++++++++++++
 tools/testing/selftests/rseq/rseq-x86-bits.h  |  43 ++++++
 tools/testing/selftests/rseq/rseq.h           |  14 ++
 11 files changed, 604 insertions(+), 21 deletions(-)
 create mode 100644 tools/testing/selftests/rseq/basic_numa_test.c

-- 
2.39.2

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ