linux-kernel - Re: [PATCH v22 2/6] sched/locking: Add blocked_on_state to provide necessary tri-state for proxy return-migration

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20251008112603.GU3419281@noisy.programming.kicks-ass.net>
Date: Wed, 8 Oct 2025 13:26:03 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: John Stultz <jstultz@...gle.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
	Joel Fernandes <joelagnelf@...dia.com>,
	Qais Yousef <qyousef@...alina.io>, Ingo Molnar <mingo@...hat.com>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Valentin Schneider <vschneid@...hat.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>,
	Zimuzo Ezeozue <zezeozue@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Will Deacon <will@...nel.org>, Waiman Long <longman@...hat.com>,
	Boqun Feng <boqun.feng@...il.com>,
	"Paul E. McKenney" <paulmck@...nel.org>,
	Metin Kaya <Metin.Kaya@....com>,
	Xuewen Yan <xuewen.yan94@...il.com>,
	K Prateek Nayak <kprateek.nayak@....com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Daniel Lezcano <daniel.lezcano@...aro.org>,
	Suleiman Souhlal <suleiman@...gle.com>,
	kuyo chang <kuyo.chang@...iatek.com>, hupu <hupu.gm@...il.com>,
	kernel-team@...roid.com
Subject: Re: [PATCH v22 2/6] sched/locking: Add blocked_on_state to provide
 necessary tri-state for proxy return-migration

On Fri, Sep 26, 2025 at 03:29:10AM +0000, John Stultz wrote:
> As we add functionality to proxy execution, we may migrate a
> donor task to a runqueue where it can't run due to cpu affinity.
> Thus, we must be careful to ensure we return-migrate the task
> back to a cpu in its cpumask when it becomes unblocked.
> 
> Thus we need more then just a binary concept of the task being
> blocked on a mutex or not.
> 
> So add a blocked_on_state value to the task, that allows the
> task to move through BO_RUNNING -> BO_BLOCKED -> BO_WAKING
> and back to BO_RUNNING. This provides a guard state in
> BO_WAKING so we can know the task is no longer blocked
> but we don't want to run it until we have potentially
> done return migration, back to a usable cpu.
> 
> Signed-off-by: John Stultz <jstultz@...gle.com>
> ---
>  include/linux/sched.h     | 92 +++++++++++++++++++++++++++++++++------
>  init/init_task.c          |  3 ++
>  kernel/fork.c             |  3 ++
>  kernel/locking/mutex.c    | 15 ++++---
>  kernel/locking/ww_mutex.h | 20 ++++-----
>  kernel/sched/core.c       | 45 +++++++++++++++++--
>  kernel/sched/sched.h      |  6 ++-
>  7 files changed, 146 insertions(+), 38 deletions(-)
> 
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index cb4e81d9d9b67..8245940783c77 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -813,6 +813,12 @@ struct kmap_ctrl {
>  #endif
>  };
>  
> +enum blocked_on_state {
> +	BO_RUNNABLE,
> +	BO_BLOCKED,
> +	BO_WAKING,
> +};

I am still struggling with all this.

  RUNNABLE is !p->blocked_on
  BLOCKED is !!p->blocked_on
  WAKING is !!p->blocked_on but you need magical beans

I'm not sure I follow the argument above, and there is a distinct lack
of comments with this enum explaining the states (although there are
some comments scattered across the patch itself).

Last time we talked about this:

  https://lkml.kernel.org/r/20241216165419.GE35539@noisy.programming.kicks-ass.net

I was equally confused; and suggested not having the WAKING state by
simply dequeueing the offending task and letting ttwu() sort it out --
since we know a wakeup will be coming our way.

I'm thinking that suggesting didn't work out somehow, but I'm still not
sure I understand why.

There is this comment:


+               /*
+                * If a ww_mutex hits the die/wound case, it marks the task as
+                * BO_WAKING and calls try_to_wake_up(), so that the mutex
+                * cycle can be broken and we avoid a deadlock.
+                *
+                * However, if at that moment, we are here on the cpu which the
+                * die/wounded task is enqueued, we might loop on the cycle as
+                * BO_WAKING still causes task_is_blocked() to return true
+                * (since we want return migration to occur before we run the
+                * task).
+                *
+                * Unfortunately since we hold the rq lock, it will block
+                * try_to_wake_up from completing and doing the return
+                * migration.
+                *
+                * So when we hit a !BO_BLOCKED task briefly schedule idle
+                * so we release the rq and let the wakeup complete.
+                */
+               if (p->blocked_on_state != BO_BLOCKED)
+                       return proxy_resched_idle(rq);


Which I presume tries to clarify things, but that only had me scratching
my head again. Why would you need task_is_blocked() to affect return
migration?