Message-ID: <20250516031814.1870508-1-jstultz@google.com>
Date: Fri, 16 May 2025 03:17:47 +0000
From: John Stultz <jstultz@...gle.com>
To: LKML <linux-kernel@...r.kernel.org>
Cc: John Stultz <jstultz@...gle.com>, Joel Fernandes <joelagnelf@...dia.com>,
Qais Yousef <qyousef@...alina.io>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
Valentin Schneider <vschneid@...hat.com>, Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Zimuzo Ezeozue <zezeozue@...gle.com>, Mel Gorman <mgorman@...e.de>,
Will Deacon <will@...nel.org>, Waiman Long <longman@...hat.com>, Boqun Feng <boqun.feng@...il.com>,
"Paul E. McKenney" <paulmck@...nel.org>, Metin Kaya <Metin.Kaya@....com>,
Xuewen Yan <xuewen.yan94@...il.com>, K Prateek Nayak <kprateek.nayak@....com>,
Thomas Gleixner <tglx@...utronix.de>, Daniel Lezcano <daniel.lezcano@...aro.org>,
Suleiman Souhlal <suleiman@...gle.com>, kernel-team@...roid.com
Subject: [PATCH v17 0/8] Single RunQueue Proxy Execution (v17)
Hey All,
Many many thanks to Peter, Prateek, Metin and Juri for their
helpful feedback on the last revision. I got a little
side-tracked on some other things, so this took a bit longer to
send out than I had hoped. I have tried to integrate most of the
changes suggested, but as always I may have missed things in all
the great feedback, so please let me know if you find anything.
So with that out of the way, here is v17 of the Proxy Execution
series, a generalized form of priority inheritance.
As I’m trying to submit this work in smallish digestible pieces,
in this series I’m only submitting for review the logic that
allows us to do the proxying if the lock owner is on the same
runqueue as the blocked waiter. This covers introducing the
CONFIG_SCHED_PROXY_EXEC option and boot-argument, reworking the
task_struct::blocked_on pointer and wrapper functions, the
initial sketch of the find_proxy_task() logic, some fixes for
using split contexts, and finally same-runqueue proxying.
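To illustrate the same-runqueue restriction described above, here
is a toy userspace model of walking a blocked_on chain. It is not
the kernel code: struct toy_task, struct toy_mutex and
toy_find_proxy() are hypothetical names made up for this sketch,
and it assumes the chain is acyclic and that nothing changes while
it is being walked (the real code has to hold locks for that).

```c
#include <stddef.h>

/* Toy stand-ins for the kernel structures; every name here is hypothetical. */
struct toy_mutex;

struct toy_task {
	const char *name;
	int cpu;                      /* runqueue the task is queued on */
	struct toy_mutex *blocked_on; /* NULL if the task is runnable */
};

struct toy_mutex {
	struct toy_task *owner;       /* NULL if the mutex is unlocked */
};

/*
 * Follow donor->blocked_on->owner until a runnable task is found.
 *
 * Returns the task that could run on the donor's behalf, or NULL when
 * the toy model gives up: either the lock was released mid-chain (the
 * caller would simply pick again), or the owner sits on a different
 * runqueue, which this sketch does not handle.
 *
 * Assumes the chain is acyclic; a deadlocked chain would loop forever.
 */
struct toy_task *toy_find_proxy(struct toy_task *donor)
{
	struct toy_task *p = donor;

	while (p->blocked_on) {
		struct toy_task *owner = p->blocked_on->owner;

		if (!owner)
			return NULL; /* lock released: re-pick */
		if (owner->cpu != donor->cpu)
			return NULL; /* cross-runqueue: out of scope here */
		p = owner;
	}
	return p;
}
```

With tasks A -> B -> C all on the same toy runqueue, calling
toy_find_proxy() on A returns C; moving C to another cpu makes the
same call return NULL.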
The biggest change with v17 is work to fix an issue Peter
pointed out about thread-group CPU time accounting. I’ve added
an additional patch to reorganize some logic, and included a fix
along with additional comments to try to make it clear. Please
let me know if there are still concerns here.
I’ve also continued working on the rest of the series, which you
can find here:
https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v17-6.15-rc6
https://github.com/johnstultz-work/linux-dev.git proxy-exec-v17-6.15-rc6
New changes:
* A number of improvements to the commit messages and comments
suggested by Juri Lelli and Peter Zijlstra
* Added missing logic to put_prev_task_dl as pointed out by
K Prateek Nayak
* Add lockdep_assert_held_once and drop the READ_ONCE in
__get_task_blocked_on(), as suggested by Juri Lelli
* Introduced a new patch to move update_curr_task logic into
update_curr_se to simplify things
* Renamed update_se_times to update_se, as suggested by Peter
* Reworked logic to fix an issue Peter pointed out with thread
group accounting being done on the donor, rather than the
running execution context.
* Fixed typos caught by Metin Kaya
* Reworked commit messages so Cc: list is below the fold, as
Peter seems to prefer that when applying commits
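The thread group accounting item in the list above can be pictured
with a small toy model. Only the rule that group CPU time follows
the running execution context comes from the list; how the other
two counters are split between the donor and the running task is
an assumption of this sketch, and struct acct_task, struct
acct_group and acct_charge() are hypothetical names, not kernel
interfaces.

```c
#include <stdint.h>

/* Hypothetical toy types; not the kernel's task or signal structures. */
struct acct_group {
	uint64_t cputime_ns;      /* thread-group CPU time */
};

struct acct_task {
	struct acct_group *group;
	uint64_t sum_exec_ns;     /* time this task's code actually ran */
	uint64_t sched_charge_ns; /* time charged for scheduling decisions */
};

/*
 * Charge delta_ns of CPU time while 'exec' runs using the scheduling
 * context donated by 'donor'. Without proxying, donor == exec and all
 * three counters land on the same task.
 */
void acct_charge(struct acct_task *donor, struct acct_task *exec,
		 uint64_t delta_ns)
{
	/* Assumption: the donated scheduling context pays for the time. */
	donor->sched_charge_ns += delta_ns;

	/* Assumption: per-task execution time follows the running task. */
	exec->sum_exec_ns += delta_ns;

	/* From the list above: group time follows the execution context. */
	exec->group->cputime_ns += delta_ns;
}
```

Charging 1000ns with distinct donor and exec tasks leaves the
donor's group untouched and adds 1000ns to the exec task's group.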
Issues still to address with the full series:
* Peter suggested that, instead of using
(blocked_on_state == BO_WAKING) when tasks become unblocked to
protect against running proxy-migrated tasks on CPUs they are
not affined to, we could dequeue tasks first and then wake them.
This does look to be cleaner in many ways, but the locking
rework is significant and I’ve not worked out all the kinks
with it yet. I am also a little worried that we may trip other
wakeup paths that might not do the dequeue first. However, I
have adopted this approach for the find_proxy_task() forced
return migration, and it’s working well.
* The new rework using guard() cleans up a lot of things, but
there are some edge cases where we change blocked_on locks, or
need to drop locks to do migration, so there still are some
odd goto exit cases needed to get out of the guard scope.
Ideas for further cleanups would be welcome here.
* Need to sort out what is needed for sched_ext to be ok with
proxy-execution enabled.
* K Prateek Nayak did some testing a bit over a year ago
with an earlier version of the series and saw ~3-5%
regressions in some cases. I’m hoping to look into this soon
to see if we can reduce those further.
* The chain migration functionality needs further iterations and
better validation to ensure it truly maintains the RT/DL load
balancing invariants (despite this being broken in vanilla
upstream with RT_PUSH_IPI currently).
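As a rough picture of the guard() item in the list above, here is
a userspace sketch of scope-based unlocking built on the compiler's
cleanup attribute (a GCC/Clang extension). It shows why a goto is
needed to leave the guarded scope before work that must run
unlocked. TOY_GUARD(), struct toy_lock and toy_pick() are
hypothetical names for this sketch only, and the "migration" is
just a placeholder.

```c
/* Toy lock that only tracks whether it is held; hypothetical. */
struct toy_lock {
	int held;
};

static void toy_lock_acquire(struct toy_lock *l) { l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { l->held = 0; }

/* Cleanup handler: called with a pointer to the guard variable. */
static void toy_guard_exit(struct toy_lock **lp)
{
	if (*lp)
		toy_lock_release(*lp);
}

/* Take the lock now and release it when 'name' goes out of scope. */
#define TOY_GUARD(name, lock)					\
	struct toy_lock *name					\
		__attribute__((cleanup(toy_guard_exit))) =	\
		(toy_lock_acquire(lock), (lock))

/*
 * Returns 1 if the work was done under the lock, or 0 if the caller
 * has to retry because the lock had to be dropped first.
 */
int toy_pick(struct toy_lock *rq_lock, int need_migration)
{
	int ret = 1;

	{
		TOY_GUARD(g, rq_lock);

		if (need_migration) {
			ret = 0;
			goto out; /* leaving the block releases the lock */
		}
		/* ... work that needs rq_lock held would go here ... */
	}
out:
	/* rq_lock is no longer held here, on either path. */
	return ret;
}
```

The nested block plus goto is the awkward part: the guard only
releases the lock at the end of its scope, so anything that needs
the lock dropped has to jump out of that scope first.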
I’d really appreciate any feedback or review thoughts on the
full series as well. I’m trying to keep the chunks small,
reviewable and iteratively testable, but if you have any
suggestions on how to improve the series, I’m all ears.
Credit/Disclaimer:
------------------
As mentioned previously, this Proxy Execution series has a long
history:
It was first described in a paper[1] by Watkins, Straub, and
Niehaus, then came patches from Peter Zijlstra, extended with
lots of work by Juri Lelli, Valentin Schneider, and Connor
O'Brien. (And thank you to Steven Rostedt for providing
additional details here!)
So again, many thanks to those above, as all the credit for this
series really is due to them - while the mistakes are likely
mine.
Thanks so much!
-john
[1] https://static.lwn.net/images/conf/rtlws11/papers/proc/p38.pdf
Cc: Joel Fernandes <joelagnelf@...dia.com>
Cc: Qais Yousef <qyousef@...alina.io>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Juri Lelli <juri.lelli@...hat.com>
Cc: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Valentin Schneider <vschneid@...hat.com>
Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Ben Segall <bsegall@...gle.com>
Cc: Zimuzo Ezeozue <zezeozue@...gle.com>
Cc: Mel Gorman <mgorman@...e.de>
Cc: Will Deacon <will@...nel.org>
Cc: Waiman Long <longman@...hat.com>
Cc: Boqun Feng <boqun.feng@...il.com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Metin Kaya <Metin.Kaya@....com>
Cc: Xuewen Yan <xuewen.yan94@...il.com>
Cc: K Prateek Nayak <kprateek.nayak@....com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: Suleiman Souhlal <suleiman@...gle.com>
Cc: kernel-team@...roid.com
John Stultz (4):
  sched: Add CONFIG_SCHED_PROXY_EXEC & boot argument to enable/disable
  sched: Move update_curr_task logic into update_curr_se
  sched: Fix runtime accounting w/ split exec & sched contexts
  sched: Add an initial sketch of the find_proxy_task() function
Peter Zijlstra (2):
  locking/mutex: Rework task_struct::blocked_on
  sched: Start blocked_on chain processing in find_proxy_task()
Valentin Schneider (2):
  locking/mutex: Add p->blocked_on wrappers for correctness checks
  sched: Fix proxy/current (push,pull)ability
 .../admin-guide/kernel-parameters.txt |   5 +
 include/linux/sched.h                 |  72 ++++-
 init/Kconfig                          |  12 +
 kernel/fork.c                         |   3 +-
 kernel/locking/mutex-debug.c          |   9 +-
 kernel/locking/mutex.c                |  18 ++
 kernel/locking/mutex.h                |   3 +-
 kernel/locking/ww_mutex.h             |  16 +-
 kernel/sched/core.c                   | 258 +++++++++++++++++-
 kernel/sched/deadline.c               |   7 +
 kernel/sched/fair.c                   |  65 +++--
 kernel/sched/rt.c                     |   5 +
 kernel/sched/sched.h                  |  22 +-
 13 files changed, 449 insertions(+), 46 deletions(-)
--
2.49.0.1101.gccaa498523-goog