Message-ID: <aFmwHzO2AKFXO_YS@slm.duckdns.org>
Date: Mon, 23 Jun 2025 09:50:55 -1000
From: 'Tejun Heo' <tj@...nel.org>
To: liuwenfang <liuwenfang@...or.com>
Cc: 'David Vernet' <void@...ifault.com>, 'Andrea Righi' <arighi@...dia.com>,
'Changwoo Min' <changwoo@...lia.com>,
'Ingo Molnar' <mingo@...hat.com>,
'Peter Zijlstra' <peterz@...radead.org>,
'Juri Lelli' <juri.lelli@...hat.com>,
'Vincent Guittot' <vincent.guittot@...aro.org>,
'Dietmar Eggemann' <dietmar.eggemann@....com>,
'Steven Rostedt' <rostedt@...dmis.org>,
'Ben Segall' <bsegall@...gle.com>, 'Mel Gorman' <mgorman@...e.de>,
'Valentin Schneider' <vschneid@...hat.com>,
"'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched_ext: Fix cpu_released while RT task and SCX task
are scheduled concurrently
Hello,
On Sat, Jun 21, 2025 at 04:09:55AM +0000, liuwenfang wrote:
> Supposed RT task(rt1) is running on one CPU with its rq->scx.cpu_released
> set to true, if the rt1 becomes sleeping, then the scheduler will balance
> the remote SCX task(scx1) because there is no other RT task on its rq,
> and rq->scx.cpu_released is false. While one RT task(rt2) is placed on
> this rq(maybe rt2 wakeup or migration occurs) before the scx1 is enqueued,
> then the scheduler will pick rt2. At last, rt2 will be running on this cpu
> with rq->scx.cpu_released being false!
> The main reason is that consume_remote_task() will unlock rq lock.
This is rather difficult to follow. Can you please break this down to a
table? People often use a format like the following:
  CPU X                         CPU Y
  A does something
                                B does something else
  ...
                                ...
  Boom
> @@ -2470,6 +2471,11 @@ static inline void put_prev_set_next_task(struct rq *rq,
>
> prev->sched_class->put_prev_task(rq, prev, next);
> next->sched_class->set_next_task(rq, next, true);
> +
> +#ifdef CONFIG_SCHED_CLASS_EXT
> + if (scx_enabled())
> + switch_class(rq, next);
> +#endif
You're right that there is a race condition around this and I can't see a
way to solve this in SCX proper as there's no way for balance() to tell
whether a higher priority sched class has queued something while balance()
dropped the rq lock for migration, so adding a hook to
put_prev_set_next_task() seems like a reasonable solution. However, can you
please do the following?
- Improve the description so that the race condition is clearly
understandable and explain why the extra hook in put_prev_set_next_task()
is necessary.
- Rename switch_class() to something which fits the new location better -
maybe scx_put_prev_set_next_task().
- If the function is called from put_prev_set_next_task(), it doesn't need
to be called from put_prev_task_scx(). Drop that call.
Thanks.
--
tejun