Message-ID: <20241125195204.2374458-1-jstultz@google.com>
Date: Mon, 25 Nov 2024 11:51:54 -0800
From: John Stultz <jstultz@...gle.com>
To: LKML <linux-kernel@...r.kernel.org>
Cc: John Stultz <jstultz@...gle.com>, Joel Fernandes <joelaf@...gle.com>, 
	Qais Yousef <qyousef@...alina.io>, Ingo Molnar <mingo@...hat.com>, 
	Peter Zijlstra <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>, 
	Vincent Guittot <vincent.guittot@...aro.org>, Dietmar Eggemann <dietmar.eggemann@....com>, 
	Valentin Schneider <vschneid@...hat.com>, Steven Rostedt <rostedt@...dmis.org>, 
	Ben Segall <bsegall@...gle.com>, Zimuzo Ezeozue <zezeozue@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Will Deacon <will@...nel.org>, Waiman Long <longman@...hat.com>, Boqun Feng <boqun.feng@...il.com>, 
	"Paul E. McKenney" <paulmck@...nel.org>, Metin Kaya <Metin.Kaya@....com>, 
	Xuewen Yan <xuewen.yan94@...il.com>, K Prateek Nayak <kprateek.nayak@....com>, 
	Thomas Gleixner <tglx@...utronix.de>, Daniel Lezcano <daniel.lezcano@...aro.org>, kernel-team@...roid.com
Subject: [RFC][PATCH v14 0/7] Single CPU Proxy Execution (v14)

Hey All,

After sending out the last revision, I got some feedback
pointing out that I hadn’t done much testing with the
now-upstream PREEMPT_RT option, resulting in a few build
issues. So I wanted to send out the next iteration of Proxy
Execution - an approach for a generalized form of priority
inheritance.

In this series, I’m only submitting the logic to support Proxy
Execution as both a build and runtime option, the mutex
blocked_on rework, some small fixes for assumptions that proxy
changes, and the initial logic to run lock owners in place of
the waiting task on the same cpu.
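
For anyone new to the idea, here is a toy userspace sketch of what
"run the lock owner in place of the waiting task on the same cpu"
means. This is not the kernel code and shouldn't be read as how the
patches implement it: toy_task, toy_mutex, toy_find_proxy_task(),
TOY_MAX_CHAIN and the demo helpers are all made up for illustration.
Only the blocked_on field name and the find_proxy_task() name are
borrowed from the series, and there is no locking or runqueue
handling at all.

```c
#include <assert.h>
#include <stddef.h>

struct toy_task;

/* Toy mutex: only tracks who currently holds it (hypothetical type). */
struct toy_mutex {
	struct toy_task *owner;
};

/* Toy task: blocked_on is NULL when the task can run (hypothetical type). */
struct toy_task {
	int pid;
	int cpu;
	struct toy_mutex *blocked_on;
};

/* Arbitrary bound so a cyclic chain cannot loop forever (assumption). */
#define TOY_MAX_CHAIN 64

/*
 * Walk the blocked_on chain starting at the task the scheduler selected
 * (the donor) and return the task that should actually run in its place.
 * Returns NULL if the lock has no owner, if the chain leaves this cpu,
 * or if the chain is longer than TOY_MAX_CHAIN.
 */
static struct toy_task *toy_find_proxy_task(struct toy_task *donor, int this_cpu)
{
	struct toy_task *p = donor;
	int depth;

	assert(donor != NULL);

	for (depth = 0; depth < TOY_MAX_CHAIN; depth++) {
		struct toy_task *owner;

		if (!p->blocked_on)
			return p;	/* runnable: run it on behalf of the donor */

		owner = p->blocked_on->owner;
		if (!owner)
			return NULL;	/* lock was released; the toy just gives up */
		if (owner->cpu != this_cpu)
			return NULL;	/* same-cpu only in this sketch */
		p = owner;
	}
	return NULL;
}

/* A (pid 1) waits on m1 held by B (pid 2), which waits on m2 held by C (pid 3). */
static int toy_demo_same_cpu(void)
{
	struct toy_mutex m1 = { NULL }, m2 = { NULL };
	struct toy_task c = { .pid = 3, .cpu = 0, .blocked_on = NULL };
	struct toy_task b = { .pid = 2, .cpu = 0, .blocked_on = &m2 };
	struct toy_task a = { .pid = 1, .cpu = 0, .blocked_on = &m1 };
	struct toy_task *next;

	m1.owner = &b;
	m2.owner = &c;
	next = toy_find_proxy_task(&a, 0);
	return next ? next->pid : -1;
}

/* Same chain, but the final owner sits on another cpu. */
static int toy_demo_cross_cpu(void)
{
	struct toy_mutex m1 = { NULL }, m2 = { NULL };
	struct toy_task c = { .pid = 3, .cpu = 1, .blocked_on = NULL };
	struct toy_task b = { .pid = 2, .cpu = 0, .blocked_on = &m2 };
	struct toy_task a = { .pid = 1, .cpu = 0, .blocked_on = &m1 };
	struct toy_task *next;

	m1.owner = &b;
	m2.owner = &c;
	next = toy_find_proxy_task(&a, 0);
	return next ? next->pid : -1;
}
```

In the first demo the walk starts at task 1 and ends at task 3, the
one task in the chain that can make progress. In the second demo the
toy simply returns NULL when the owner is on another cpu; what the
real code does in that case is outside the scope of this sketch.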

With v14 of this set, most of the changes are just build fixups
related to PREEMPT_RT. With PREEMPT_RT, the mutex abstraction
uses rt_mutex as the underlying structure, so accessing the
mutex owner runs into some trouble. I need to do some further
work abstracting how I access the owner to get it working, so
for now I’ve just made the SCHED_PROXY_EXEC option mutually
exclusive with PREEMPT_RT.

I’ve also continued working on the rest of the series, which you
can find here:
  https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v14-6.13-rc1/
  https://github.com/johnstultz-work/linux-dev.git proxy-exec-v14-6.13-rc1

New changes in the full series include:
* Rework of sleeping_owner handling so that we properly deal
  with delayed-dequeued (sched_delayed) tasks (also removes now
  unused proxy_deactivate() logic)
* Improving edge cases in ttwu where we wouldn’t set the task
  as BO_RUNNABLE
* Making sure we call block_task() last in proxy_enqueue_on_owner()
  and don’t touch the task again afterwards, to avoid races where
  it might be activated on another cpu
* Make sure we always activate blocked_entities when we exit
  from ttwu
* Fix to enqueue the last task in the chain (p) on the blocked
  owner instead of donor, so that we preserve the chain
  structure, so mid-chain wakeups propagate properly

Issues still to address with the full series:
* While I think I’ve now properly handled delayed dequeued tasks,
  I’d still appreciate any input on ways of better generalizing
  these multiple approaches to having un-runnable blocked tasks
  remaining on the runqueue.
* Even with some of the fixes in this version (to be clear, this
  applies to the full series, not the same-rq proxying series I’m
  sending out here), I still have to include some workarounds to
  avoid hitting some rare cases of what seem to be lost wakeups,
  where a task was marked as BO_WAKING, but then ttwu never managed
  to transition it to BO_RUNNABLE. The workarounds handle doing the
  return migration from find_proxy_task(), but I still feel that
  those fixups shouldn’t be necessary, so I suspect the ttwu
  logic has a race somewhere that I’m missing.
* K Prateek Nayak did some testing about a year ago with an
  earlier version of the series and saw ~3-5% regressions in
  some cases. I’m hoping to look into this soon to see if we
  can reduce those further.
* The chain migration functionality needs further iterations
  and better validation to ensure it truly maintains the RT/DL
  load balancing invariants (despite this being broken in
  vanilla upstream with RT_PUSH_IPI currently)

I’d really appreciate any feedback or review thoughts on this
series! I’m trying to keep the chunks small, reviewable and
iteratively testable, but if you have any suggestions on how to
improve the series, I’m all ears.

Credit/Disclaimer:
--------------------
As mentioned previously, this Proxy Execution series has a long
history: 

It was first described in a paper[1] by Watkins, Straub, and
Niehaus, then developed in patches from Peter Zijlstra, and
extended with lots of work by Juri Lelli, Valentin Schneider,
and Connor O'Brien. (And thank you to Steven Rostedt for
providing additional details here!)

So again, many thanks to those above, as all the credit for this
series really is due to them - while the mistakes are likely
mine.

Thanks so much!
-john

[1] https://static.lwn.net/images/conf/rtlws11/papers/proc/p38.pdf



John Stultz (4):
  sched: Add CONFIG_SCHED_PROXY_EXEC & boot argument to enable/disable
  sched: Fix runtime accounting w/ split exec & sched contexts
  sched: Fix psi_dequeue for Proxy Execution
  sched: Add an initial sketch of the find_proxy_task() function

Peter Zijlstra (2):
  locking/mutex: Rework task_struct::blocked_on
  sched: Start blocked_on chain processing in find_proxy_task()

Valentin Schneider (1):
  sched: Fix proxy/current (push,pull)ability

 .../admin-guide/kernel-parameters.txt         |   5 +
 include/linux/sched.h                         |  79 ++++-
 init/Kconfig                                  |   9 +
 init/init_task.c                              |   1 +
 kernel/fork.c                                 |   4 +-
 kernel/locking/mutex-debug.c                  |   9 +-
 kernel/locking/mutex.c                        |  40 ++-
 kernel/locking/mutex.h                        |   3 +-
 kernel/locking/ww_mutex.h                     |  24 +-
 kernel/sched/core.c                           | 300 +++++++++++++++++-
 kernel/sched/fair.c                           |  31 +-
 kernel/sched/rt.c                             |  15 +-
 kernel/sched/sched.h                          |  22 +-
 kernel/sched/stats.h                          |   6 +-
 14 files changed, 511 insertions(+), 37 deletions(-)

-- 
2.47.0.371.ga323438b13-goog

