Message-ID: <20241125195204.2374458-1-jstultz@google.com>
Date: Mon, 25 Nov 2024 11:51:54 -0800
From: John Stultz <jstultz@...gle.com>
To: LKML <linux-kernel@...r.kernel.org>
Cc: John Stultz <jstultz@...gle.com>, Joel Fernandes <joelaf@...gle.com>,
Qais Yousef <qyousef@...alina.io>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
Valentin Schneider <vschneid@...hat.com>, Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Zimuzo Ezeozue <zezeozue@...gle.com>, Mel Gorman <mgorman@...e.de>,
Will Deacon <will@...nel.org>, Waiman Long <longman@...hat.com>, Boqun Feng <boqun.feng@...il.com>,
"Paul E. McKenney" <paulmck@...nel.org>, Metin Kaya <Metin.Kaya@....com>,
Xuewen Yan <xuewen.yan94@...il.com>, K Prateek Nayak <kprateek.nayak@....com>,
Thomas Gleixner <tglx@...utronix.de>, Daniel Lezcano <daniel.lezcano@...aro.org>, kernel-team@...roid.com
Subject: [RFC][PATCH v14 0/7] Single CPU Proxy Execution (v14)
Hey All,
After sending out the last revision, I got some feedback that
pointed out I hadn’t done much testing with the PREEMPT_RT
option, which can now be enabled upstream, resulting in a few
build issues. So I wanted to send out the next iteration of
Proxy Execution - an approach for a generalized form of
priority inheritance.
In this series, I’m only submitting the logic to support Proxy
Execution as both a build and runtime option, the mutex
blocked_on rework, some small fixes for assumptions that proxy
changes, and the initial logic to run lock owners in place of
the waiting task on the same cpu.
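To make "run lock owners in place of the waiting task on the
same cpu" a bit more concrete, here is a minimal userspace
sketch in C. It is a toy model and not the code in this series:
toy_task, toy_mutex, toy_find_proxy() and TOY_MAX_CHAIN are
hypothetical names, and it assumes a plain blocked_on -> owner
chain with no locking, no migration and no sleeping owners.

```c
#include <stddef.h>

/* Toy stand-ins only; NOT the kernel's struct mutex / task_struct. */
struct toy_task;

struct toy_mutex {
	struct toy_task *owner;		/* task currently holding the lock */
};

struct toy_task {
	const char *name;
	int cpu;			/* runqueue the task sits on */
	struct toy_mutex *blocked_on;	/* lock being waited on, or NULL */
};

/* Arbitrary bound (an assumption) so a cyclic chain cannot loop forever. */
#define TOY_MAX_CHAIN 64

/*
 * Starting from the task the scheduler selected (the donor), follow
 * blocked_on -> owner links until reaching a task that can run.
 * Returns that task if every task visited is on this_cpu, or NULL
 * if the chain leaves this_cpu or exceeds TOY_MAX_CHAIN.
 */
struct toy_task *toy_find_proxy(struct toy_task *donor, int this_cpu)
{
	struct toy_task *p = donor;
	int depth;

	for (depth = 0; depth < TOY_MAX_CHAIN; depth++) {
		if (p->cpu != this_cpu)
			return NULL;	/* cross-cpu case: out of scope here */
		if (!p->blocked_on)
			return p;	/* not blocked: run it for the donor */
		if (!p->blocked_on->owner)
			return p;	/* lock is free: let p retry taking it */
		p = p->blocked_on->owner;
	}
	return NULL;			/* chain too long or cyclic */
}
```

For example, if A is blocked on a mutex owned by B, B is blocked
on a mutex owned by C, and all three are on cpu 0, then
toy_find_proxy(&A, 0) returns C. If C is on a different cpu, it
returns NULL, which in this toy model stands for the cases the
same-cpu logic does not handle.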
With v14 of this set, most of the changes are just build fixups
related to PREEMPT_RT. With PREEMPT_RT the abstraction around
mutexes means accessing the mutex owner runs into some trouble
with the rt_mutex as the underlying structure changes, so I need
to do some further work abstracting how I access that to get it
working. So for now, I’ve just made the SCHED_PROXY_EXEC option
mutually exclusive with PREEMPT_RT.
I’ve also continued working on the rest of the series, which you
can find here:
https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v14-6.13-rc1/
https://github.com/johnstultz-work/linux-dev.git proxy-exec-v14-6.13-rc1
New changes in the full series include:
* Rework of sleeping_owner handling so that we properly deal
with delayed-dequeued (sched_delayed) tasks (also removes now
unused proxy_deactivate() logic)
* Improving edge cases in ttwu where we wouldn’t set the task
as BO_RUNNABLE
* Making sure we call block_task() last in
proxy_enqueue_on_owner() and don’t touch the task again
afterwards, to avoid races where it might be activated on
another cpu
* Make sure we always activate blocked_entities when we exit
from ttwu
* Fix to enqueue the last task in the chain (p) on the blocked
owner instead of the donor, preserving the chain structure so
that mid-chain wakeups propagate properly
Issues still to address with the full series:
* While I think I’ve now properly handled delayed dequeued tasks,
I’d still appreciate any input on ways of better generalizing
these multiple approaches to having un-runnable blocked tasks
remaining on the runqueue.
* Even with some of the fixes in this version (and again, to be
clear, this applies to the full series, not the same-rq
proxying series I’m sending out here), I still have to include
some workarounds to avoid hitting
some rare cases of what seem to be lost wakeups, where a task
was marked as BO_WAKING, but then ttwu never managed to
transition it to BO_RUNNABLE. The workarounds handle doing the
return migration from find_proxy_task() but I still feel that
those fixups shouldn’t be necessary, so I suspect the ttwu
logic has a race somewhere I’m missing.
* K Prateek Nayak did some testing about a year ago with an
earlier version of the series and saw ~3-5% regressions in
some cases. I’m hoping to look into this soon to see if we
can reduce those further.
* The chain migration functionality needs further iterations
and better validation to ensure it truly maintains the RT/DL
load balancing invariants (despite this being broken in
vanilla upstream with RT_PUSH_IPI currently)
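As a rough illustration of the lost-wakeup symptom described in
the list above, the blocked-on state can be thought of as a small
state machine. The sketch below is a userspace toy in C under
stated assumptions: BO_WAKING and BO_RUNNABLE are the states named
above, while the blocked state, the event names, and the exact set
of legal transitions are assumptions made for illustration and are
not taken from the series.

```c
#include <stdbool.h>

/* Toy mirror of the blocked-on states; the BLOCKED state is assumed. */
enum toy_bo_state {
	TOY_BO_RUNNABLE,	/* may be run as itself */
	TOY_BO_BLOCKED,		/* waiting on a lock (assumed state) */
	TOY_BO_WAKING,		/* wakeup started, not yet completed */
};

/* Hypothetical events that drive the toy state machine. */
enum toy_bo_event {
	TOY_EV_BLOCK,		/* task blocks on a lock */
	TOY_EV_UNLOCK,		/* lock released, waiter marked waking */
	TOY_EV_TTWU,		/* wakeup path completes */
};

/*
 * Apply one event. Returns true and updates *st if the transition
 * is legal in this toy model, otherwise returns false and leaves
 * *st unchanged.
 */
bool toy_bo_transition(enum toy_bo_state *st, enum toy_bo_event ev)
{
	switch (ev) {
	case TOY_EV_BLOCK:
		if (*st != TOY_BO_RUNNABLE)
			return false;
		*st = TOY_BO_BLOCKED;
		return true;
	case TOY_EV_UNLOCK:
		if (*st != TOY_BO_BLOCKED)
			return false;
		*st = TOY_BO_WAKING;
		return true;
	case TOY_EV_TTWU:
		if (*st != TOY_BO_WAKING)
			return false;
		*st = TOY_BO_RUNNABLE;
		return true;
	}
	return false;
}

/*
 * A task that stays in the waking state is the lost-wakeup symptom:
 * it was marked waking, but the final transition never happened.
 */
bool toy_bo_wakeup_pending(enum toy_bo_state st)
{
	return st == TOY_BO_WAKING;
}
```

In this model, a task for which toy_bo_wakeup_pending() stays true
indefinitely corresponds to the case the workarounds paper over:
something has to notice the stuck task and finish the transition
on its behalf.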
I’d really appreciate any feedback or review thoughts on this
series! I’m trying to keep the chunks small, reviewable and
iteratively testable, but if you have any suggestions on how to
improve the series, I’m all ears.
Credit/Disclaimer:
------------------
As mentioned previously, this Proxy Execution series has a long
history:
It was first described in a paper[1] by Watkins, Straub, and
Niehaus, then developed in patches from Peter Zijlstra, and
extended with lots of work by Juri Lelli, Valentin Schneider,
and Connor O'Brien. (And thank you to Steven Rostedt for
providing additional details here!)
So again, many thanks to those above, as all the credit for this
series really is due to them - while the mistakes are likely
mine.
Thanks so much!
-john
[1] https://static.lwn.net/images/conf/rtlws11/papers/proc/p38.pdf
Cc: Joel Fernandes <joelaf@...gle.com>
Cc: Qais Yousef <qyousef@...alina.io>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Juri Lelli <juri.lelli@...hat.com>
Cc: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Valentin Schneider <vschneid@...hat.com>
Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Ben Segall <bsegall@...gle.com>
Cc: Zimuzo Ezeozue <zezeozue@...gle.com>
Cc: Mel Gorman <mgorman@...e.de>
Cc: Will Deacon <will@...nel.org>
Cc: Waiman Long <longman@...hat.com>
Cc: Boqun Feng <boqun.feng@...il.com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Metin Kaya <Metin.Kaya@....com>
Cc: Xuewen Yan <xuewen.yan94@...il.com>
Cc: K Prateek Nayak <kprateek.nayak@....com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: kernel-team@...roid.com
John Stultz (4):
  sched: Add CONFIG_SCHED_PROXY_EXEC & boot argument to enable/disable
  sched: Fix runtime accounting w/ split exec & sched contexts
  sched: Fix psi_dequeue for Proxy Execution
  sched: Add an initial sketch of the find_proxy_task() function
Peter Zijlstra (2):
  locking/mutex: Rework task_struct::blocked_on
  sched: Start blocked_on chain processing in find_proxy_task()
Valentin Schneider (1):
  sched: Fix proxy/current (push,pull)ability
 .../admin-guide/kernel-parameters.txt |   5 +
 include/linux/sched.h                 |  79 ++++-
 init/Kconfig                          |   9 +
 init/init_task.c                      |   1 +
 kernel/fork.c                         |   4 +-
 kernel/locking/mutex-debug.c          |   9 +-
 kernel/locking/mutex.c                |  40 ++-
 kernel/locking/mutex.h                |   3 +-
 kernel/locking/ww_mutex.h             |  24 +-
 kernel/sched/core.c                   | 300 +++++++++++++++++-
 kernel/sched/fair.c                   |  31 +-
 kernel/sched/rt.c                     |  15 +-
 kernel/sched/sched.h                  |  22 +-
 kernel/sched/stats.h                  |   6 +-
 14 files changed, 511 insertions(+), 37 deletions(-)
--
2.47.0.371.ga323438b13-goog