[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20251008112603.GU3419281@noisy.programming.kicks-ass.net>
Date: Wed, 8 Oct 2025 13:26:03 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: John Stultz <jstultz@...gle.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Joel Fernandes <joelagnelf@...dia.com>,
Qais Yousef <qyousef@...alina.io>, Ingo Molnar <mingo@...hat.com>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Valentin Schneider <vschneid@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>,
Zimuzo Ezeozue <zezeozue@...gle.com>, Mel Gorman <mgorman@...e.de>,
Will Deacon <will@...nel.org>, Waiman Long <longman@...hat.com>,
Boqun Feng <boqun.feng@...il.com>,
"Paul E. McKenney" <paulmck@...nel.org>,
Metin Kaya <Metin.Kaya@....com>,
Xuewen Yan <xuewen.yan94@...il.com>,
K Prateek Nayak <kprateek.nayak@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Suleiman Souhlal <suleiman@...gle.com>,
kuyo chang <kuyo.chang@...iatek.com>, hupu <hupu.gm@...il.com>,
kernel-team@...roid.com
Subject: Re: [PATCH v22 2/6] sched/locking: Add blocked_on_state to provide
necessary tri-state for proxy return-migration
On Fri, Sep 26, 2025 at 03:29:10AM +0000, John Stultz wrote:
> As we add functionality to proxy execution, we may migrate a
> donor task to a runqueue where it can't run due to cpu affinity.
> Thus, we must be careful to ensure we return-migrate the task
> back to a cpu in its cpumask when it becomes unblocked.
>
> Thus we need more then just a binary concept of the task being
> blocked on a mutex or not.
>
> So add a blocked_on_state value to the task, that allows the
> task to move through BO_RUNNING -> BO_BLOCKED -> BO_WAKING
> and back to BO_RUNNING. This provides a guard state in
> BO_WAKING so we can know the task is no longer blocked
> but we don't want to run it until we have potentially
> done return migration, back to a usable cpu.
>
> Signed-off-by: John Stultz <jstultz@...gle.com>
> ---
> include/linux/sched.h | 92 +++++++++++++++++++++++++++++++++------
> init/init_task.c | 3 ++
> kernel/fork.c | 3 ++
> kernel/locking/mutex.c | 15 ++++---
> kernel/locking/ww_mutex.h | 20 ++++-----
> kernel/sched/core.c | 45 +++++++++++++++++--
> kernel/sched/sched.h | 6 ++-
> 7 files changed, 146 insertions(+), 38 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index cb4e81d9d9b67..8245940783c77 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -813,6 +813,12 @@ struct kmap_ctrl {
> #endif
> };
>
> +enum blocked_on_state {
> + BO_RUNNABLE,
> + BO_BLOCKED,
> + BO_WAKING,
> +};
I am still struggling with all this.
RUNNABLE is !p->blocked_on
BLOCKED is !!p->blocked_on
WAKING is !!p->blocked_on but you need magical beans
I'm not sure I follow the argument above, and there is a distinct lack
of comments with this enum explaining the states (although there are
some comments scattered across the patch itself).
Last time we talked about this:
https://lkml.kernel.org/r/20241216165419.GE35539@noisy.programming.kicks-ass.net
I was equally confused; and suggested not having the WAKING state by
simply dequeueing the offending task and letting ttwu() sort it out --
since we know a wakeup will be coming our way.
I'm thinking that suggesting didn't work out somehow, but I'm still not
sure I understand why.
There is this comment:
+ /*
+ * If a ww_mutex hits the die/wound case, it marks the task as
+ * BO_WAKING and calls try_to_wake_up(), so that the mutex
+ * cycle can be broken and we avoid a deadlock.
+ *
+ * However, if at that moment, we are here on the cpu which the
+ * die/wounded task is enqueued, we might loop on the cycle as
+ * BO_WAKING still causes task_is_blocked() to return true
+ * (since we want return migration to occur before we run the
+ * task).
+ *
+ * Unfortunately since we hold the rq lock, it will block
+ * try_to_wake_up from completing and doing the return
+ * migration.
+ *
+ * So when we hit a !BO_BLOCKED task briefly schedule idle
+ * so we release the rq and let the wakeup complete.
+ */
+ if (p->blocked_on_state != BO_BLOCKED)
+ return proxy_resched_idle(rq);
Which I presume tries to clarify things, but that only had me scratching
my head again. Why would you need task_is_blocked() to affect return
migration?
Powered by blists - more mailing lists