Message-ID: <e52847fc-8aae-4fd7-90e4-494be02e214b@linux.ibm.com>
Date: Wed, 9 Apr 2025 11:31:31 +0530
From: Madadi Vineeth Reddy <vineethr@...ux.ibm.com>
To: hupu <hupu.gm@...il.com>
Cc: jstultz@...gle.com, linux-kernel@...r.kernel.org, juri.lelli@...hat.com,
peterz@...radead.org, vschneid@...hat.com, mingo@...hat.com,
vincent.guittot@...aro.org, dietmar.eggemann@....com,
rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
hupu@...nssion.com, Madadi Vineeth Reddy <vineethr@...ux.ibm.com>
Subject: Re: [RFC 1/1] sched: Remove unreliable wake_cpu check in proxy_needs_return
On 07/04/25 19:16, hupu wrote:
> The (p->wake_cpu != cpu_of(rq)) check in proxy_needs_return() is unsafe
> during the early wakeup phase. When called via the ttwu_runnable() path:
>
> |-- try_to_wake_up
>     |-- ttwu_runnable
>         |-- proxy_needs_return    //we are here
>     |-- select_task_rq
>     |-- set_task_cpu              //set p->wake_cpu here
>     |-- ttwu_queue
>
> At this point p->wake_cpu reflects the CPU where the donor last ran before
> blocking, not the target migration CPU. During the blocking period:
> 1. CPU affinity may have been changed by other threads
> 2. Proxy migrations might have altered the effective wake_cpu
> 3. set_task_cpu() hasn't updated wake_cpu yet in this code path
>
> This makes the wake_cpu vs current CPU comparison meaningless and potentially
> dangerous. Rely on find_proxy_task()'s later migration logic to handle CPU
> placement based on up-to-date affinity and scheduler state.
>
> Signed-off-by: hupu <hupu.gm@...il.com>
> ---
> kernel/sched/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 3c4ef4c71cfd..ca4ca739eb85 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4047,7 +4047,7 @@ static inline bool proxy_needs_return(struct rq *rq, struct task_struct *p)
>
>  	raw_spin_lock(&p->blocked_lock);
>  	if (__get_task_blocked_on(p) && p->blocked_on_state & BO_NEEDS_RETURN) {
> -		if (!task_current(rq, p) && (p->wake_cpu != cpu_of(rq))) {
> +		if (!task_current(rq, p)) {
>  			if (task_current_donor(rq, p)) {
>  				put_prev_task(rq, p);
>  				rq_set_donor(rq, rq->idle);
Which tree is this change based on? I don't see `proxy_needs_return` in tip/sched/core.

Thanks,
Madadi Vineeth Reddy