Message-ID: <20251030001857.681432-1-jstultz@google.com>
Date: Thu, 30 Oct 2025 00:18:41 +0000
From: John Stultz <jstultz@...gle.com>
To: LKML <linux-kernel@...r.kernel.org>
Cc: John Stultz <jstultz@...gle.com>, Joel Fernandes <joelagnelf@...dia.com>, 
	Qais Yousef <qyousef@...alina.io>, Ingo Molnar <mingo@...hat.com>, 
	Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, 
	Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	Valentin Schneider <vschneid@...hat.com>, Steven Rostedt <rostedt@...dmis.org>, 
	Ben Segall <bsegall@...gle.com>, Zimuzo Ezeozue <zezeozue@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Will Deacon <will@...nel.org>, Waiman Long <longman@...hat.com>, Boqun Feng <boqun.feng@...il.com>, 
	"Paul E. McKenney" <paulmck@...nel.org>, Metin Kaya <Metin.Kaya@....com>, 
	Xuewen Yan <xuewen.yan94@...il.com>, K Prateek Nayak <kprateek.nayak@....com>, 
	Thomas Gleixner <tglx@...utronix.de>, Daniel Lezcano <daniel.lezcano@...aro.org>, 
	Suleiman Souhlal <suleiman@...gle.com>, kuyo chang <kuyo.chang@...iatek.com>, hupu <hupu.gm@...il.com>, 
	kernel-team@...roid.com
Subject: [PATCH v23 0/9] Donor Migration for Proxy Execution (v23)

Hey All,

Just another iteration on the next chunk of the proxy-exec
series: Donor Migration.

This is just the next step for Proxy Execution, to allow us to
migrate blocked donors across runqueues to boost remote lock
owners.

As always, I’m trying to submit this larger work in smallish
digestible pieces, so in this portion of the series, I’m only
submitting for review and consideration the logic that allows us
to do donor (blocked waiter) migration. This requires some
additional changes to locking and extra state tracking to ensure
we don’t accidentally run a migrated donor on a CPU it isn’t
affined to, as well as some extra handling to deal with balance
callback state that needs to be reset when we decide to pick a
different task after doing donor migration.

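
To illustrate the affinity concern described above, here is a
minimal userspace sketch in C. It is not the code from this
series: struct toy_donor, toy_migrate_donor() and
toy_needs_return_migration() are hypothetical names, and a plain
64-bit mask stands in for the kernel's cpumask. It only models
the rule that a blocked donor may sit on a runqueue outside its
affinity mask, but must be moved back before it actually runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a task; not task_struct. */
struct toy_donor {
	uint64_t cpus_allowed;	/* bit N set => task may run on CPU N */
	int      cpu;		/* runqueue the task currently sits on */
	bool     blocked;	/* still waiting on a lock (acting as donor) */
};

/* True if @cpu is within the task's affinity mask. */
bool toy_cpu_allowed(const struct toy_donor *p, int cpu)
{
	if (cpu < 0 || cpu >= 64)
		return false;
	return (p->cpus_allowed >> cpu) & 1u;
}

/*
 * Move a blocked donor to the lock owner's runqueue. Affinity is
 * deliberately ignored here: only the donor's scheduling context is
 * used on that CPU, the donor itself does not execute there.
 */
void toy_migrate_donor(struct toy_donor *p, int owner_cpu)
{
	assert(p->blocked);	/* only blocked waiters are proxy-migrated */
	p->cpu = owner_cpu;
}

/*
 * Once the donor is runnable again it must not execute on a CPU
 * outside its mask, so it needs a "return migration" first.
 */
bool toy_needs_return_migration(const struct toy_donor *p)
{
	return !p->blocked && !toy_cpu_allowed(p, p->cpu);
}
```

In this toy model, a donor allowed on CPUs 0-1 that was migrated
to CPU 5 reports that it needs return migration as soon as it
stops being blocked.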
Peter provided some very helpful review and feedback on the last
iteration, and I’ve tried to address his concerns and suggestions
here. However, the rework ended up being fairly significant (and
I think I stepped on just about every rake possible in the
process :P), so stabilizing the changes took much longer than I
had hoped for. I suspect Peter will have further suggestions
for changes.

New in this iteration:

* Folded the blocked_on_state into the blocked_on ptr, by
  introducing a special PROXY_WAKING value. This slight
  “compression” of state tracking had some subtle implications,
  especially in the find_proxy_task() chain walking, but I think
  I’ve got it worked out.
* Split the donor migration patch into a few smaller patches,
  handling manual return migration from find_proxy_task() first,
  and then adding the return migration logic in try_to_wake_up()
  in a separate patch, as requested by Peter. This also
  uncovered quite a few subtleties and required majorly
  reworking the proxy_force_return() logic.
* Pulled out the balance_callback WARN_ON into its own helper
  function so we can check it at the top of the pick_again loop,
  as Peter suggested.

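
As a rough illustration of folding a separate state field into
the blocked_on pointer, here is a small userspace sketch in C.
It is only a model under stated assumptions: struct toy_mutex,
struct toy_waiter, the TOY_PROXY_WAKING encoding and the helper
names are all hypothetical and are not taken from the patches.
The point is that a reserved pointer value can carry the "waking"
state, and that any chain walk then has to check for it before
treating blocked_on as a real lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_mutex { int dummy; };

/* Assumed encoding: an address that no real lock can have. */
#define TOY_PROXY_WAKING ((struct toy_mutex *)(uintptr_t)-1L)

struct toy_waiter {
	/* NULL (not blocked), a real lock, or TOY_PROXY_WAKING */
	struct toy_mutex *blocked_on;
};

enum toy_state { TOY_RUNNABLE, TOY_BLOCKED, TOY_WAKING };

/* Decode the three states from the single pointer. */
enum toy_state toy_blocked_state(const struct toy_waiter *p)
{
	if (!p->blocked_on)
		return TOY_RUNNABLE;
	if (p->blocked_on == TOY_PROXY_WAKING)
		return TOY_WAKING;
	return TOY_BLOCKED;
}

/* Record that the task is waiting on a real lock. */
void toy_set_blocked_on(struct toy_waiter *p, struct toy_mutex *m)
{
	assert(m && m != TOY_PROXY_WAKING);
	p->blocked_on = m;
}

/* Mark a previously blocked task as being woken. */
void toy_set_waking(struct toy_waiter *p)
{
	if (p->blocked_on)
		p->blocked_on = TOY_PROXY_WAKING;
}

/* A chain walk must never dereference the sentinel value. */
struct toy_mutex *toy_lock_to_follow(const struct toy_waiter *p)
{
	return toy_blocked_state(p) == TOY_BLOCKED ? p->blocked_on : NULL;
}
```

The trade-off in this model is that one word now encodes three
states, so every reader of blocked_on has to handle the sentinel
explicitly.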

I’d love to get further feedback on any place where these
patches are confusing, or could use additional clarifications.

In the full series, I don’t have much new, as this rework took
up much of my time. But I’d still appreciate any testing or
comments that folks have.

You can find the full proxy-exec series here:
  https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v23-6.18-rc3
  https://github.com/johnstultz-work/linux-dev.git proxy-exec-v23-6.18-rc3

Issues still to address with the full series:

* Continue working to get sched_ext to be ok with
  proxy-execution enabled.
* I’ve reproduced the performance regression K Prateek Nayak
  found with the full series. I’m hoping to work to
  understand and narrow the issue down soon.
* Polish Suleiman’s rwsem patches some, as the PROXY_WAKING
  rework added some atomicity complications to traversing and
  locking the blocked_on structure and my initial fixups aren’t
  super elegant.
* The chain migration functionality needs further iterations and
  better validation to ensure it truly maintains the RT/DL load
  balancing invariants (despite this being broken in vanilla
  upstream with RT_PUSH_IPI currently).

Future work:

* Expand to more locking primitives: Figuring out pi-futexes
  would be good, and using proxy for Binder PI is something else
  we’re exploring.
* Eventually: Work to replace rt_mutexes and get things happy
  with PREEMPT_RT.

I’d really appreciate any feedback or review thoughts on the
full series as well. I’m trying to keep the chunks small,
reviewable and iteratively testable, but if you have any
suggestions on how to improve the larger series, I’m all ears.

Credit/Disclaimer:
--------------------
As always, this Proxy Execution series has a long history with
lots of developers that deserve credit:

First described in a paper[1] by Watkins, Straub, Niehaus, then
from patches from Peter Zijlstra, extended with lots of work by
Juri Lelli, Valentin Schneider, and Connor O'Brien (and thank
you to Steven Rostedt for providing additional details here!).

Thanks also to Joel Fernandes, Dietmar Eggemann, Metin Kaya,
K Prateek Nayak and Suleiman Souhlal for their substantial
review, suggestion, and patch contributions.

So again, many thanks to those above, as all the credit for
this series really is due to them - while the mistakes are
surely mine.

Thanks so much!
-john

[1] https://static.lwn.net/images/conf/rtlws11/papers/proc/p38.pdf

John Stultz (8):
  locking: Add task::blocked_lock to serialize blocked_on state
  sched: Fix modifying donor->blocked on without proper locking
  sched/locking: Add special p->blocked_on==PROXY_WAKING value for proxy
    return-migration
  sched: Add assert_balance_callbacks_empty helper
  sched: Add logic to zap balance callbacks if we pick again
  sched: Handle blocked-waiter migration (and return migration)
  sched: Have try_to_wake_up() handle return-migration for PROXY_WAKING
    case
  sched: Migrate whole chain in proxy_migrate_task()
Peter Zijlstra (1):
  sched: Add blocked_donor link to task for smarter mutex handoffs
 include/linux/sched.h        |  95 ++++++---
 init/init_task.c             |   5 +
 kernel/fork.c                |   5 +
 kernel/locking/mutex-debug.c |   4 +-
 kernel/locking/mutex.c       |  82 ++++++--
 kernel/locking/mutex.h       |   6 +
 kernel/locking/ww_mutex.h    |  16 +-
 kernel/sched/core.c          | 372 +++++++++++++++++++++++++++++++++--
 kernel/sched/sched.h         |  11 +-
 9 files changed, 519 insertions(+), 77 deletions(-)
-- 
2.51.1.930.gacf6e81ea2-goog