Message-ID: <20250224132836.383041-1-gmonaco@redhat.com>
Date: Mon, 24 Feb 2025 14:28:32 +0100
From: Gabriele Monaco <gmonaco@...hat.com>
To: linux-kernel@...r.kernel.org,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.org>,
"Paul E. McKenney" <paulmck@...nel.org>,
Shuah Khan <shuah@...nel.org>
Cc: Gabriele Monaco <gmonaco@...hat.com>
Subject: [PATCH v9 0/3] sched: Restructure task_mm_cid_work for predictability
This patchset moves task_mm_cid_work to a preemptible and migratable
context, which reduces the impact of this work on the scheduling latency
of real-time tasks.
The change also makes the recurrence of the work more predictable.
The behaviour causing the latency was introduced in commit 223baf9d17f2
("sched: Fix performance regression introduced by mm_cid"), which
added a task work tied to the scheduler tick.
That approach presents two possible issues:
* the task work runs before returning to user space and therefore adds
  scheduling latency (of a significant order of magnitude on PREEMPT_RT)
* periodic tasks with a short runtime are less likely to be running
  during the tick, hence they might not run the task work at all
Patch 1 adds support for prev_sum_exec_runtime to the RT, deadline and
sched_ext classes, as the fair class already supports it. This is
required to avoid calling rseq_preempt on the tick if the runtime is
below a threshold.
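The runtime check that Patch 1 enables can be pictured with a small
userspace model. This is only a sketch: the struct, the function names
and the constant name below are hypothetical and are not the kernel
code. The 100ms value comes from the V8 to V9 changelog entry, and the
snapshot-on-pick behaviour mirrors how the fair class maintains
prev_sum_exec_runtime.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name; the value follows the "more than 100ms" changelog entry. */
#define MODEL_RUNTIME_THRESHOLD_NS (100ULL * 1000 * 1000)

/* Userspace stand-in for the two runtime counters discussed above. */
struct runtime_model {
	uint64_t sum_exec_runtime;      /* total CPU time consumed, in ns */
	uint64_t prev_sum_exec_runtime; /* snapshot taken when the task was picked */
};

/* What a scheduling class is assumed to do when it picks the task to run. */
static inline void model_pick_task(struct runtime_model *rt)
{
	rt->prev_sum_exec_runtime = rt->sum_exec_runtime;
}

/* Account delta_ns of additional CPU time to the task. */
static inline void model_account(struct runtime_model *rt, uint64_t delta_ns)
{
	rt->sum_exec_runtime += delta_ns;
}

/* On a tick: has the task run without interruption for longer than the threshold? */
static inline bool model_tick_wants_rseq_preempt(const struct runtime_model *rt)
{
	uint64_t slice = rt->sum_exec_runtime - rt->prev_sum_exec_runtime;

	return slice > MODEL_RUNTIME_THRESHOLD_NS;
}
```

In this model a task that is picked, runs for 50ms and then hits a tick
is left alone, while one that has accumulated 110ms since it was picked
would get the rseq_preempt treatment. Without the snapshot, the
difference cannot be computed for RT, deadline or sched_ext tasks.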
Patch 2 contains the main changes: it removes the task_work on the
scheduler tick and uses a work_struct that is scheduled more reliably
during __rseq_handle_notify_resume.
Patch 3 adds a selftest to validate the functionality of the
task_mm_cid_work (i.e. that it compacts the mm_cids).
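As a rough illustration of what "compacting the mm_cids" means, the toy
model below tracks concurrency IDs in a 64-bit mask. It is a sketch
under simplifying assumptions, not the kernel implementation or the
selftest: all names are hypothetical, and the real work operates on
per-CPU state rather than a single bitmap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one bit per concurrency ID (cid), at most 64 cids. */
typedef uint64_t cid_mask_t;

/* Allocate the lowest free cid, or return -1 if all 64 are taken. */
static inline int model_cid_alloc(cid_mask_t *mask)
{
	for (int cid = 0; cid < 64; cid++) {
		if (!(*mask & (UINT64_C(1) << cid))) {
			*mask |= UINT64_C(1) << cid;
			return cid;
		}
	}
	return -1;
}

/* Compaction step: drop every cid that no running thread currently holds. */
static inline void model_cid_compact(cid_mask_t *mask, cid_mask_t in_use)
{
	*mask &= in_use;
}

/* Property of interest: with nr_threads users, no cid is >= nr_threads. */
static inline bool model_cids_are_compact(cid_mask_t mask, unsigned int nr_threads)
{
	if (nr_threads >= 64)
		return true; /* every cid of this toy model is below 64 */
	return (mask >> nr_threads) == 0;
}
```

In the model, four allocations hand out cids 0 to 3. If only the thread
holding cid 0 keeps running, the stale cids 1 to 3 stay reserved until
the compaction step clears them, after which the mask is compact for a
single thread again.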
Changes since V8 [1]:
* Add support for prev_sum_exec_runtime to RT, deadline and sched_ext
* Avoid rseq_preempt on ticks unless executing for more than 100ms
* Queue the work on the unbound workqueue
Changes since V7:
* Schedule mm_cid compaction and update at every tick too
* mmgrab before scheduling the work
Changes since V6 [2]:
* Switch to a simple work_struct instead of a delayed work
* Schedule the work_struct in __rseq_handle_notify_resume
* Asynchronously disable the work but make sure mm is there while we run
* Remove first patch as merged independently
* Fix commit tag for test
Changes since V5:
* Punctuation
Changes since V4 [3]:
* Fixes on the selftest
* Polished memory allocation and cleanup
* Handle the test failure in main
Changes since V3 [4]:
* Fixes on the selftest
* Minor style issues in comments and indentation
* Use of perror where possible
* Add a barrier to align threads execution
* Improve test failure and error handling
Changes since V2 [5]:
* Change the order of the patches
* Merge patches changing the main delayed_work logic
* Improved self-test to spawn one fewer thread and use the main one instead
Changes since V1 [6]:
* Re-arm the delayed_work at each invocation
* Cancel the work synchronously at mmdrop
* Remove next scan fields and completely rely on the delayed_work
* Shrink mm_cid allocation with nr thread/affinity (Mathieu Desnoyers)
* Add self test
[1] - https://lore.kernel.org/lkml/20250220102639.141314-1-gmonaco@redhat.com
[2] - https://lore.kernel.org/lkml/20250210153253.460471-1-gmonaco@redhat.com
[3] - https://lore.kernel.org/lkml/20250113074231.61638-4-gmonaco@redhat.com
[4] - https://lore.kernel.org/lkml/20241216130909.240042-1-gmonaco@redhat.com
[5] - https://lore.kernel.org/lkml/20241213095407.271357-1-gmonaco@redhat.com
[6] - https://lore.kernel.org/lkml/20241205083110.180134-2-gmonaco@redhat.com
Gabriele Monaco (3):
  sched: Add prev_sum_exec_runtime support for RT, DL and SCX classes
  sched: Move task_mm_cid_work to mm work_struct
  selftests/rseq: Add test for mm_cid compaction

 include/linux/mm_types.h                      |   8 +
 include/linux/rseq.h                          |   2 +
 include/linux/sched.h                         |   7 +-
 kernel/rseq.c                                 |   1 +
 kernel/sched/core.c                           |  42 ++--
 kernel/sched/deadline.c                       |   1 +
 kernel/sched/ext.c                            |   1 +
 kernel/sched/rt.c                             |   1 +
 kernel/sched/sched.h                          |   2 -
 tools/testing/selftests/rseq/.gitignore       |   1 +
 tools/testing/selftests/rseq/Makefile         |   2 +-
 .../selftests/rseq/mm_cid_compaction_test.c   | 200 ++++++++++++++++++
 12 files changed, 241 insertions(+), 27 deletions(-)
 create mode 100644 tools/testing/selftests/rseq/mm_cid_compaction_test.c
base-commit: d082ecbc71e9e0bf49883ee4afd435a77a5101b6
--
2.48.1