lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240709203213.799070-1-jstultz@google.com>
Date: Tue,  9 Jul 2024 13:31:43 -0700
From: John Stultz <jstultz@...gle.com>
To: LKML <linux-kernel@...r.kernel.org>
Cc: John Stultz <jstultz@...gle.com>, Joel Fernandes <joelaf@...gle.com>, 
	Qais Yousef <qyousef@...alina.io>, Ingo Molnar <mingo@...hat.com>, 
	Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, 
	Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	Valentin Schneider <vschneid@...hat.com>, Steven Rostedt <rostedt@...dmis.org>, 
	Ben Segall <bsegall@...gle.com>, Zimuzo Ezeozue <zezeozue@...gle.com>, 
	Youssef Esmat <youssefesmat@...gle.com>, Mel Gorman <mgorman@...e.de>, Will Deacon <will@...nel.org>, 
	Waiman Long <longman@...hat.com>, Boqun Feng <boqun.feng@...il.com>, 
	"Paul E. McKenney" <paulmck@...nel.org>, Metin Kaya <Metin.Kaya@....com>, 
	Xuewen Yan <xuewen.yan94@...il.com>, K Prateek Nayak <kprateek.nayak@....com>, 
	Thomas Gleixner <tglx@...utronix.de>, Daniel Lezcano <daniel.lezcano@...aro.org>, kernel-team@...roid.com
Subject: [PATCH v11 0/7] Preparatory changes for Proxy Execution v11

Hey All,

I wanted to send out v11 of the preparatory patches for Proxy
Execution - an approach for a generalized form of priority
inheritance. Here again, I’m only submitting the early /
preparatory changes for review, in the hope that we can move
these more straightforward patches along and then iteratively
move through the more interesting patches in the Proxy Execution
series. That said, I’ve not gotten a ton of feedback with this
approach, so I’m open to other suggestions.

There have been some changes to the preparatory patches in v11:
* Qais Yousef suggested a few other spots where the
  move_queued_task_locked() helper could be used.
* Simplified the task_is_pushable() helper to return a bool as
  suggested by Metin Kaya and others. It will later be a
  tri-state return, but that can wait for later in the series
  when it is actually used.
* A few spots of re-arranging logic to reduce indentation and
  simplify things, suggested by Qais and Metin
* Metin pointed out some spots in the split scheduler and
  execution contexts patch where variables could be more clearly
  named.

Many thanks to Metin and Qais for their detailed feedback here!

I’ve also continued working on the rest of the series, which you
can find here:
 https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v11-6.10-rc7
 https://github.com/johnstultz-work/linux-dev.git proxy-exec-v11-6.10-rc7

New changes in the full series include:
* Got rid of recursion in activate_blocked_waiter logic
* Added more detail to new traceevents as well as additional
  traceevents for validating behavior
* Fixes for edge case where wake_cpu used for return migration
  ended up outside the affinity mask
* Fix for case where we weren’t preserving need_resched when
  find_proxy_task() returns the idle task
* Lots of small detail cleanups suggested by Metin

Issues still to address with the full series:
* K Prateek Nayak did some testing with an earlier version of
  the series and saw ~3-5% regressions in some cases. I’m hoping
  to look into this soon to see if we can reduce those further.
* The chain migration functionality needs further iterations and
  better validation to ensure it truly maintains the RT/DL load
  balancing invariants (despite this being broken in vanilla
  upstream with RT_PUSH_IPI currently)
* At OSPM, Juri Lelli and the (very very sadly) late Daniel
  Bristot de Oliveira raised the point that Proxy Exec may not
  actually be generalizable for SCHED_DEADLINE tasks, as one
  cannot always correctly donate the resources of the waiter to
  an owner on a different cpu. If one was to reverse the
  proxy-migration direction, migrating the owner to the waiter
  cpu, this would preserve the SCHED_DEADLINE bandwidth
  calculations, but would break down if the owner's cpu affinity
  disallowed it. To my understanding this constraint seems to
  make most forms of priority inheritance infeasible with
  SCHED_DEADLINE, but I’ll have to leave that to the
  folks/academics who know it well. After talking with Juri, my
  current plan is just to special case find_proxy_task() to not
  proxy with SCHED_DEADLINE (falling back to the current behavior
  where we deactivate the waiting task). But SCHED_NORMAL waiter
  tasks would still be able to benefit from Proxy Exec.
* Also at OSPM, Thomas Gleixner mentioned we might consider
  including Proxy Exec in the PREEMPT_RT patch series, however
  for this to be useful I need to take a stab at deprecating
  rt_mutexes for proxy mutexes, as everything is an rt_mutex
  with PREEMPT_RT.


Credit/Disclaimer:
--------------------
As mentioned previously, this Proxy Execution series has a long
history: 

First described in a paper[1] by Watkins, Straub, Niehaus, then
from patches from Peter Zijlstra, extended with lots of work by
Juri Lelli, Valentin Schneider, and Connor O'Brien. (and thank
you to Steven Rostedt for providing additional details here!)

So again, many thanks to those above, as all the credit for this
series really is due to them - while the mistakes are likely
mine.


As always, feedback and review would be greatly appreciated!

Thanks so much!
-john

[1] https://static.lwn.net/images/conf/rtlws11/papers/proc/p38.pdf

Cc: Joel Fernandes <joelaf@...gle.com>
Cc: Qais Yousef <qyousef@...alina.io>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Juri Lelli <juri.lelli@...hat.com>
Cc: Vincent Guittot <vincent.guittot@...aro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>
Cc: Valentin Schneider <vschneid@...hat.com>
Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Ben Segall <bsegall@...gle.com>
Cc: Zimuzo Ezeozue <zezeozue@...gle.com>
Cc: Youssef Esmat <youssefesmat@...gle.com>
Cc: Mel Gorman <mgorman@...e.de>
Cc: Will Deacon <will@...nel.org>
Cc: Waiman Long <longman@...hat.com>
Cc: Boqun Feng <boqun.feng@...il.com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Metin Kaya <Metin.Kaya@....com>
Cc: Xuewen Yan <xuewen.yan94@...il.com>
Cc: K Prateek Nayak <kprateek.nayak@....com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Daniel Lezcano <daniel.lezcano@...aro.org>
Cc: kernel-team@...roid.com

Connor O'Brien (2):
  sched: Add move_queued_task_locked helper
  sched: Consolidate pick_*_task to task_is_pushable helper

John Stultz (1):
  sched: Split out __schedule() deactivate task logic into a helper

Juri Lelli (2):
  locking/mutex: Make mutex::wait_lock irq safe
  locking/mutex: Expose __mutex_owner()

Peter Zijlstra (2):
  locking/mutex: Remove wakeups from under mutex::wait_lock
  sched: Split scheduler and execution contexts

 kernel/locking/mutex.c       |  60 +++++++---------
 kernel/locking/mutex.h       |  27 ++++++++
 kernel/locking/rtmutex.c     |  30 +++++---
 kernel/locking/rwbase_rt.c   |   8 ++-
 kernel/locking/rwsem.c       |   4 +-
 kernel/locking/spinlock_rt.c |   3 +-
 kernel/locking/ww_mutex.h    |  49 +++++++------
 kernel/sched/core.c          | 130 ++++++++++++++++++++---------------
 kernel/sched/deadline.c      |  57 +++++++--------
 kernel/sched/fair.c          |  32 ++++-----
 kernel/sched/rt.c            |  67 ++++++++----------
 kernel/sched/sched.h         |  48 ++++++++++++-
 12 files changed, 295 insertions(+), 220 deletions(-)

-- 
2.45.2.993.g49e7a77208-goog


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ