Message-ID: <20250716160603.138385-6-gmonaco@redhat.com>
Date: Wed, 16 Jul 2025 18:06:04 +0200
From: Gabriele Monaco <gmonaco@...hat.com>
To: linux-kernel@...r.kernel.org,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.org>
Cc: Gabriele Monaco <gmonaco@...hat.com>
Subject: [PATCH v2 0/4] sched: Run task_mm_cid_work in batches to lower latency

This V2 of [1] is a continuation of [2], using a simpler approach.
task_mm_cid_work runs as a task_work on return to userspace and
causes non-negligible scheduling latency, mostly because it iterates
over all cores.

Split the work into several batches: each call to task_mm_cid_work does
not scan all CPUs but only a configurable number of them. The next
run picks up where the previous one left off.
The mechanism that keeps the scan from running too frequently (100ms) is
enforced only once all CPUs have been scanned, that is, when starting
again from CPU 0.

Also improve the predictability of the scan on short-running tasks by
running it from rseq_handle_notify_resume, which runs it on every task
switch (similar behaviour to [2]). The workaround on the tick for
long-running tasks seen in [2] is ported here as well.

Patch 1 adds support for prev_sum_exec_runtime to the RT, deadline and
sched_ext classes, as already supported by fair. This is required to avoid
calling rseq_preempt on tick if the runtime is below a threshold.
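
As a rough illustration of why prev_sum_exec_runtime is needed, the check
on the tick could look like the userspace sketch below. The struct, the
helper names and the threshold value are assumptions made for the example
and are not taken from the patch. The sketch also assumes that
prev_sum_exec_runtime is a snapshot of sum_exec_runtime taken when the task
is scheduled in.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold; the cover letter does not state the value. */
#define RUNTIME_THRESHOLD_NS (100ULL * 1000 * 1000)

struct entity_sketch {
	uint64_t sum_exec_runtime;       /* total runtime so far, in ns */
	uint64_t prev_sum_exec_runtime;  /* snapshot taken at schedule-in */
};

/* Assumed to happen whenever the task is picked to run. */
static void sketch_sched_in(struct entity_sketch *se)
{
	se->prev_sum_exec_runtime = se->sum_exec_runtime;
}

/* On tick: act only if the task ran long enough without a switch. */
static bool sketch_tick_needs_scan(const struct entity_sketch *se)
{
	uint64_t rtime = se->sum_exec_runtime - se->prev_sum_exec_runtime;

	return rtime >= RUNTIME_THRESHOLD_NS;
}
```

Without the snapshot, a scheduling class has no cheap way to tell how long
the current task has been running since it was last switched in, so the
tick could not skip short-running tasks.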

Patch 2 calls task_mm_cid_work directly instead of relying on a
task_work; this is necessary to avoid rseq_handle_notify_resume being
called twice while enqueuing a task_work.

Patch 3 splits the work into batches.

Patch 4 adds a selftest to validate the functionality of the
task_mm_cid_work (i.e. to compact the mm_cids).

Changes since V1 [1]:
* Use cpu_possible_mask in scan.
* Make sure batches have the same number of CPUs even if the mask is sparse.
* Run the task on rseq_handle_notify_resume as in [2] but call directly.
* Schedule the work and mm_cid update on tick for long running tasks.
* Fix condition for need_scan only on first batch.
* Change RSEQ_CID_SCAN_BATCH default to be a power of 2.
* Rebase selftest on [2].
* Increase the selftest timeout on large systems.

To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Peter Zijlstra <peterz@...radead.org>
To: Ingo Molnar <mingo@...hat.org>

[1] - https://lore.kernel.org/lkml/20250217112317.258716-1-gmonaco@redhat.com
[2] - https://lore.kernel.org/lkml/20250707144824.117014-1-gmonaco@redhat.com

Gabriele Monaco (4):
  sched: Add prev_sum_exec_runtime support for RT, DL and SCX classes
  rseq: Run the mm_cid_compaction from rseq_handle_notify_resume()
  sched: Compact RSEQ concurrency IDs in batches
  selftests/rseq: Add test for mm_cid compaction

 include/linux/mm.h                            |   2 +
 include/linux/mm_types.h                      |  26 +++
 include/linux/sched.h                         |   2 +-
 init/Kconfig                                  |  12 ++
 kernel/rseq.c                                 |   2 +
 kernel/sched/core.c                           |  92 ++++++--
 kernel/sched/deadline.c                       |   1 +
 kernel/sched/ext.c                            |   1 +
 kernel/sched/rt.c                             |   1 +
 kernel/sched/sched.h                          |   2 +
 tools/testing/selftests/rseq/.gitignore       |   1 +
 tools/testing/selftests/rseq/Makefile         |   2 +-
 .../selftests/rseq/mm_cid_compaction_test.c   | 204 ++++++++++++++++++
 13 files changed, 323 insertions(+), 25 deletions(-)
 create mode 100644 tools/testing/selftests/rseq/mm_cid_compaction_test.c


base-commit: 155a3c003e555a7300d156a5252c004c392ec6b0
-- 
2.50.1

