Message-ID: <89bc1d11-c2bc-43d7-9a22-e159175706cc@efficios.com>
Date: Wed, 26 Mar 2025 10:33:46 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Gabriele Monaco <gmonaco@...hat.com>, linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.org>
Cc: "Paul E. McKenney" <paulmck@...nel.org>, Shuah Khan <shuah@...nel.org>
Subject: Re: [PATCH v12 0/3] sched: Restructure task_mm_cid_work for predictability
On 2025-03-26 03:31, Gabriele Monaco wrote:
> On Tue, 2025-03-11 at 07:28 +0100, Gabriele Monaco wrote:
>> This patchset moves the task_mm_cid_work to a preemptible and migratable
>> context. This reduces the impact of this work on the scheduling latency
>> of real-time tasks.
>> The change makes the recurrence of the task a bit more predictable.
>>
>
> The series was reviewed and, in my opinion, is ready for inclusion.
> Peter, Ingo, can we merge it?
I agree. I reviewed the entire series a few weeks ago and it
looks good to me.
Thanks,
Mathieu
>
> Thanks,
> Gabriele
>
>> The behaviour causing latency was introduced in commit 223baf9d17f2
>> ("sched: Fix performance regression introduced by mm_cid") which
>> introduced a task work tied to the scheduler tick.
>> That approach presents two possible issues:
>> * the task work runs before returning to user space and therefore adds
>> scheduling latency (of a magnitude that is significant on PREEMPT_RT)
>> * periodic tasks with short runtime are less likely to run during the
>> tick, hence they might not run the task work at all
>>
>> Patch 1 adds support for prev_sum_exec_runtime to the RT, deadline and
>> sched_ext classes, as is already supported by fair. This is required to
>> avoid calling rseq_preempt on tick if the runtime is below a threshold.
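
For readers following along, the threshold check described for patch 1 can
be pictured with the user-space sketch below. It is not the code from the
series: the struct, the helper name and the macro are hypothetical, the
100ms value is taken from the "Changes since V8" notes, and it assumes
prev_sum_exec_runtime holds a snapshot of sum_exec_runtime taken when the
task was last picked to run.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold; the V8 changelog mentions 100ms. */
#define SKETCH_RUNTIME_THRESHOLD_NS (100ULL * 1000 * 1000)

/* Hypothetical stand-in for the per-task runtime accounting fields. */
struct sketch_se {
	uint64_t sum_exec_runtime;      /* total runtime so far, in ns */
	uint64_t prev_sum_exec_runtime; /* assumed snapshot at last pick, in ns */
};

/*
 * Return true when the task has been running, since it was last picked,
 * for longer than the threshold. Only then would the tick bother
 * preempting the rseq critical section in this model.
 */
static bool sketch_tick_needs_rseq_preempt(const struct sketch_se *se)
{
	uint64_t rtime = se->sum_exec_runtime - se->prev_sum_exec_runtime;

	return rtime > SKETCH_RUNTIME_THRESHOLD_NS;
}
```

With this model, a task that has run 50ms since it was picked is left alone
on tick, while one that has run 150ms is not.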
>>
>> Patch 2 contains the main changes, removing the task_work on the
>> scheduler tick and using a work_struct scheduled more reliably during
>> __rseq_handle_notify_resume.
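
As a rough illustration of the idea in patch 2 (queue a work item from the
resume path instead of running a task work on tick), here is a small
user-space model. Every name in it is hypothetical and it does not use the
kernel workqueue API; the only behaviour it borrows is that queueing an
already pending work item is a no-op that reports false, as queue_work()
does in the kernel.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a work item. */
struct sketch_work {
	bool pending;
	void (*fn)(struct sketch_work *work);
};

/* Hypothetical per-mm state; the work item is embedded as first member. */
struct sketch_mm {
	struct sketch_work cid_work;
	unsigned int scans; /* number of times the scan ran */
};

/* The deferred scan; runs later in worker context, not on resume. */
static void sketch_cid_scan(struct sketch_work *work)
{
	/* Valid cast: cid_work is the first member of struct sketch_mm. */
	struct sketch_mm *mm = (struct sketch_mm *)work;

	mm->scans++;
}

/* Mark the work pending; report false if it was already queued. */
static bool sketch_queue_work(struct sketch_work *work)
{
	if (work->pending)
		return false;
	work->pending = true;
	return true;
}

/* Model of the resume-to-user hook: only queues, never scans inline. */
static bool sketch_notify_resume(struct sketch_mm *mm)
{
	return sketch_queue_work(&mm->cid_work);
}

/* Model of a worker thread picking up the pending item. */
static void sketch_worker_run(struct sketch_work *work)
{
	if (!work->pending)
		return;
	work->pending = false;
	work->fn(work);
}
```

The point of the model is that the resume path stays cheap: repeated
resumes before the worker runs collapse into a single scan, and the scan
itself happens in a context that can be preempted and migrated.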
>>
>> Patch 3 adds a selftest to validate the functionality of the
>> task_mm_cid_work (i.e. to compact the mm_cids).
>>
>> Changes since V11:
>> * Remove variable to make mm_cid_needs_scan more compact
>> * All patches reviewed
>>
>> Changes since V10:
>> * Fix compilation errors with RSEQ and/or MM_CID disabled
>>
>> Changes since V9:
>> * Simplify and move checks from task_queue_mm_cid to its call site
>>
>> Changes since V8 [1]:
>> * Add support for prev_sum_exec_runtime to RT, deadline and sched_ext
>> * Avoid rseq_preempt on ticks unless executing for more than 100ms
>> * Queue the work on the unbound workqueue
>>
>> Changes since V7:
>> * Schedule mm_cid compaction and update at every tick too
>> * mmgrab before scheduling the work
>>
>> Changes since V6 [2]:
>> * Switch to a simple work_struct instead of a delayed work
>> * Schedule the work_struct in __rseq_handle_notify_resume
>> * Asynchronously disable the work but make sure mm is there while we run
>> * Remove first patch as merged independently
>> * Fix commit tag for test
>>
>> Changes since V5:
>> * Punctuation
>>
>> Changes since V4 [3]:
>> * Fixes on the selftest
>> * Polished memory allocation and cleanup
>> * Handle the test failure in main
>>
>> Changes since V3 [4]:
>> * Fixes on the selftest
>> * Minor style issues in comments and indentation
>> * Use of perror where possible
>> * Add a barrier to align threads execution
>> * Improve test failure and error handling
>>
>> Changes since V2 [5]:
>> * Change the order of the patches
>> * Merge patches changing the main delayed_work logic
>> * Improved self-test to spawn 1 less thread and use the main one instead
>>
>> Changes since V1 [6]:
>> * Re-arm the delayed_work at each invocation
>> * Cancel the work synchronously at mmdrop
>> * Remove next scan fields and completely rely on the delayed_work
>> * Shrink mm_cid allocation with nr thread/affinity (Mathieu Desnoyers)
>> * Add self test
>>
>> [1] - https://lore.kernel.org/lkml/20250220102639.141314-1-gmonaco@redhat.com
>> [2] - https://lore.kernel.org/lkml/20250210153253.460471-1-gmonaco@redhat.com
>> [3] - https://lore.kernel.org/lkml/20250113074231.61638-4-gmonaco@redhat.com
>> [4] - https://lore.kernel.org/lkml/20241216130909.240042-1-gmonaco@redhat.com
>> [5] - https://lore.kernel.org/lkml/20241213095407.271357-1-gmonaco@redhat.com
>> [6] - https://lore.kernel.org/lkml/20241205083110.180134-2-gmonaco@redhat.com
>>
>> To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
>> To: Peter Zijlstra <peterz@...radead.org>
>> To: Ingo Molnar <mingo@...hat.org>
>> To: Paul E. McKenney <paulmck@...nel.org>
>> To: Shuah Khan <shuah@...nel.org>
>>
>> Gabriele Monaco (3):
>> sched: Add prev_sum_exec_runtime support for RT, DL and SCX classes
>> sched: Move task_mm_cid_work to mm work_struct
>> selftests/rseq: Add test for mm_cid compaction
>>
>> include/linux/mm_types.h | 17 ++
>> include/linux/rseq.h | 13 ++
>> include/linux/sched.h | 7 +-
>> kernel/rseq.c | 2 +
>> kernel/sched/core.c | 43 ++--
>> kernel/sched/deadline.c | 1 +
>> kernel/sched/ext.c | 1 +
>> kernel/sched/rt.c | 1 +
>> kernel/sched/sched.h | 2 -
>> tools/testing/selftests/rseq/.gitignore | 1 +
>> tools/testing/selftests/rseq/Makefile | 2 +-
>> .../selftests/rseq/mm_cid_compaction_test.c | 200 ++++++++++++++++++
>> 12 files changed, 258 insertions(+), 32 deletions(-)
>> create mode 100644 tools/testing/selftests/rseq/mm_cid_compaction_test.c
>>
>>
>> base-commit: 80e54e84911a923c40d7bee33a34c1b4be148d7a
>
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com