Message-ID: <20241106025656.2326794-1-jstultz@google.com>
Date: Tue, 5 Nov 2024 18:56:40 -0800
From: John Stultz <jstultz@...gle.com>
To: LKML <linux-kernel@...r.kernel.org>
Cc: John Stultz <jstultz@...gle.com>, Joel Fernandes <joelaf@...gle.com>,
Qais Yousef <qyousef@...alina.io>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>,
Valentin Schneider <vschneid@...hat.com>, Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Zimuzo Ezeozue <zezeozue@...gle.com>, Mel Gorman <mgorman@...e.de>,
Will Deacon <will@...nel.org>, Waiman Long <longman@...hat.com>, Boqun Feng <boqun.feng@...il.com>,
"Paul E. McKenney" <paulmck@...nel.org>, Metin Kaya <Metin.Kaya@....com>,
Xuewen Yan <xuewen.yan94@...il.com>, K Prateek Nayak <kprateek.nayak@....com>,
Thomas Gleixner <tglx@...utronix.de>, Daniel Lezcano <daniel.lezcano@...aro.org>, kernel-team@...roid.com
Subject: [RFC][PATCH v13 0/7] Single CPU Proxy Execution (v13)
Hey All,
Since the earlier proxy-execution preparation patches have
been queued in tip/sched/core, I wanted to send out the next
chunk of Proxy Execution - an approach for a generalized form of
priority inheritance.
In this series, I’m only submitting the logic to support Proxy
Execution as both a build and runtime option, the mutex
blocked_on rework, some small fixes for assumptions that proxy
changes, and the initial logic to run lock owners in place of
the waiting task on the same cpu.
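
As a rough illustration of that last item (running the lock owner in
place of the waiting task on the same cpu), here is a toy userspace
model of walking a blocked_on chain. It is only a sketch under
assumptions: the toy_task and toy_mutex types, the toy_find_proxy()
name, the TOY_MAX_CHAIN bound, and the choice to return NULL when the
chain leaves the CPU are all hypothetical, and none of it is the code
in the patches.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model; none of these types are the kernel's. */
struct toy_mutex;

struct toy_task {
	const char *name;
	int cpu;                      /* CPU whose runqueue holds the task */
	struct toy_mutex *blocked_on; /* NULL when the task is runnable */
};

struct toy_mutex {
	struct toy_task *owner;       /* NULL when the mutex is unlocked */
};

/* Hypothetical bound so a cyclic chain cannot loop forever. */
#define TOY_MAX_CHAIN 64

/*
 * Start at the donor (the task the scheduler selected) and follow
 * blocked_on -> owner links. Returns the runnable task at the end of
 * the chain if every owner along the way is on this_cpu. Returns NULL
 * if an owner sits on another CPU, a mutex has no owner, or the chain
 * is longer than TOY_MAX_CHAIN; the caller then has to fall back to
 * some other policy, which this sketch does not model.
 */
struct toy_task *toy_find_proxy(struct toy_task *donor, int this_cpu)
{
	struct toy_task *p = donor;
	int depth;

	assert(donor != NULL);

	for (depth = 0; depth < TOY_MAX_CHAIN; depth++) {
		struct toy_task *owner;

		if (p->blocked_on == NULL)
			return p;        /* runnable: execute this task */

		owner = p->blocked_on->owner;
		if (owner == NULL || owner->cpu != this_cpu)
			return NULL;     /* cannot proxy on this CPU */

		p = owner;               /* keep following the chain */
	}
	return NULL;
}
```

In this model a donor that is not blocked is simply returned as is, so
the same call covers both the ordinary case and the proxied one.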
With v13 of this series, there have been quite a number of
changes:
* Mostly dealing with collisions from changes that landed in
6.12-rc1
* Only the most basic handling of delayed-dequeue tasks (just
deactivate for now)
* Renaming “next” as “donor” to clarify things in proxy related
functions
* Lots of cleanups
I’ve also continued working on the rest of the series, which
you can find here:
https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v13-6.12-rc6
https://github.com/johnstultz-work/linux-dev.git proxy-exec-v13-6.12-rc6
New changes in the full series include:
* After we talked at LPC, Juri suggested that for now we
re-add donor migrations of SCHED_DEADLINE tasks, so I’ve
dropped the logic that previously disabled this.
* Workaround handling for the “lost wakeups” issue (see below)
Issues still to address with the full series:
* The new delayed dequeuing logic added in 6.12-rc1 conceptually
collides with proxy-execution: we now have multiple kinds of
tasks that aren’t runnable for different reasons (one may be
sleeping, another may be blocked on a mutex) but are left on
the RQ, and the rules for how we handle these un-runnable but
enqueued tasks are different for each case.
Right now my workaround is to just stop proxying if we hit a
sched_delayed task, but I’d like to have a better solution.
My plan is to treat it similarly to sleeping tasks, and do the
same deactivated-owner-queuing (queuing the waiters on the
sched_delayed task). The problem is that when a sched_delayed
task gets a wakeup, we won’t hit the logic to do the
blocked-waiters activation, so I’ll need to change that.
Just getting it working won’t address the conceptual
collision, so I’d love any thoughts or feedback on how to
generalize these two new forms of unrunnable-on-the-runqueue
states.
* In testing with the full series (again, to be clear, not with
the same-rq proxying series I’m sending out here), I hit some
rare cases of what seem to be lost wakeups, where a task was
marked as BO_WAKING, but then ttwu never managed to transition
it to BO_RUNNABLE. This can cause us to get stuck either in the
pick-again loop, or in an idle resched loop. I’ve added handlers
to detect this and to safely do the BO_WAKING -> BO_RUNNABLE
transition along with return migration if needed to avoid this
issue, but this really is papering over the underlying issue.
This has been difficult to diagnose because, by the time the
issue is noticed, the wakeup may be long in the past and the
trace buffer overwritten.
* K Prateek Nayak did some testing with an earlier version of the
series and saw ~3-5% regressions in some cases. I’m hoping to
look into this soon to see if we can reduce those further.
* The chain migration functionality needs further iterations and
better validation to ensure it truly maintains the RT/DL load
balancing invariants (despite this being broken in vanilla
upstream with RT_PUSH_IPI currently)
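
To make the lost-wakeup workaround described above a bit more
concrete, here is a toy userspace sketch of the state transition. It
is only an illustration under assumptions: BO_WAKING and BO_RUNNABLE
are the state names used above, but the TOY_ prefixed enum, the
TOY_BO_BLOCKED state, the wake_cpu field, and both function names are
hypothetical and are not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model of the blocked-on state. WAKING and RUNNABLE mirror the
 * names used in the text; BLOCKED is an assumed third state.
 */
enum toy_bo_state { TOY_BO_RUNNABLE, TOY_BO_BLOCKED, TOY_BO_WAKING };

struct toy_task {
	enum toy_bo_state bo_state;
	int cpu;      /* CPU the task is currently queued on */
	int wake_cpu; /* hypothetical: CPU the task should return to */
};

/* The normal wakeup path: finish the WAKING -> RUNNABLE transition. */
void toy_ttwu(struct toy_task *p)
{
	assert(p != NULL);
	if (p->bo_state == TOY_BO_WAKING)
		p->bo_state = TOY_BO_RUNNABLE;
}

/*
 * Fallback handler, imagined as being called when a pick finds a task
 * still marked WAKING. It does a return migration if the task is on
 * the wrong CPU, completes the transition, and reports whether it had
 * to repair anything.
 */
bool toy_fixup_lost_wakeup(struct toy_task *p)
{
	assert(p != NULL);
	if (p->bo_state != TOY_BO_WAKING)
		return false;            /* nothing was lost */

	if (p->cpu != p->wake_cpu)
		p->cpu = p->wake_cpu;    /* return migration */

	p->bo_state = TOY_BO_RUNNABLE;
	return true;
}
```

The sketch treats any task seen in the WAKING state as stuck, which is
a simplification; telling a lost wakeup apart from one that is merely
in flight is exactly the part that is hard to diagnose.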
I’d really appreciate any feedback or review thoughts on this
series. I’m trying to keep the chunks small, reviewable and
iteratively testable, but if you have any suggestions on how to
improve the series, I’m all ears.
Credit/Disclaimer:
------------------
As mentioned previously, this Proxy Execution series has a long
history:
It was first described in a paper[1] by Watkins, Straub, and
Niehaus, then came patches from Peter Zijlstra, extended with
lots of work by Juri Lelli, Valentin Schneider, and Connor
O'Brien. (And thank you to Steven Rostedt for providing
additional details here!)
So again, many thanks to those above, as all the credit for this
series really is due to them - while the mistakes are likely mine.
Thanks so much!
-john
[1] https://static.lwn.net/images/conf/rtlws11/papers/proc/p38.pdf
Cc: Joel Fernandes <joelaf@...gle.com>
Cc: Qais Yousef <qyousef@...alina.io>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Juri Lelli <juri.lelli@...hat.com>
Cc: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Valentin Schneider <vschneid@...hat.com>
Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Ben Segall <bsegall@...gle.com>
Cc: Zimuzo Ezeozue <zezeozue@...gle.com>
Cc: Mel Gorman <mgorman@...e.de>
Cc: Will Deacon <will@...nel.org>
Cc: Waiman Long <longman@...hat.com>
Cc: Boqun Feng <boqun.feng@...il.com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Metin Kaya <Metin.Kaya@....com>
Cc: Xuewen Yan <xuewen.yan94@...il.com>
Cc: K Prateek Nayak <kprateek.nayak@....com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: kernel-team@...roid.com
John Stultz (4):
sched: Add CONFIG_SCHED_PROXY_EXEC & boot argument to enable/disable
sched: Fix runtime accounting w/ split exec & sched contexts
sched: Fix psi_dequeue for Proxy Execution
sched: Add an initial sketch of the find_proxy_task() function
Peter Zijlstra (2):
locking/mutex: Rework task_struct::blocked_on
sched: Start blocked_on chain processing in find_proxy_task()
Valentin Schneider (1):
sched: Fix proxy/current (push,pull)ability
.../admin-guide/kernel-parameters.txt | 5 +
include/linux/sched.h | 79 ++++-
init/Kconfig | 7 +
init/init_task.c | 1 +
kernel/fork.c | 4 +-
kernel/locking/mutex-debug.c | 9 +-
kernel/locking/mutex.c | 40 ++-
kernel/locking/ww_mutex.h | 24 +-
kernel/sched/core.c | 300 +++++++++++++++++-
kernel/sched/fair.c | 31 +-
kernel/sched/rt.c | 15 +-
kernel/sched/sched.h | 22 +-
kernel/sched/stats.h | 6 +-
13 files changed, 507 insertions(+), 36 deletions(-)
--
2.47.0.199.ga7371fff76-goog