Message-ID: <55d707b1f4e1478ea19b022e60805b98@honor.com>
Date: Tue, 19 Aug 2025 08:47:38 +0000
From: liuwenfang <liuwenfang@...or.com>
To: 'Peter Zijlstra' <peterz@...radead.org>
CC: 'Tejun Heo' <tj@...nel.org>, 'David Vernet' <void@...ifault.com>, "'Andrea
Righi'" <arighi@...dia.com>, 'Changwoo Min' <changwoo@...lia.com>, "'Ingo
Molnar'" <mingo@...hat.com>, 'Juri Lelli' <juri.lelli@...hat.com>, "'Vincent
Guittot'" <vincent.guittot@...aro.org>, 'Dietmar Eggemann'
<dietmar.eggemann@....com>, 'Steven Rostedt' <rostedt@...dmis.org>, "'Ben
Segall'" <bsegall@...gle.com>, 'Mel Gorman' <mgorman@...e.de>, "'Valentin
Schneider'" <vschneid@...hat.com>, "'linux-kernel@...r.kernel.org'"
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v4 2/3] sched_ext: Fix cpu_released while RT task and SCX task are scheduled concurrently
Hello,
> Could you please not thread your new patches onto the old thread? That makes
> them near impossible to find.
I will try to fix it later.
>
> On Tue, Aug 19, 2025 at 06:55:38AM +0000, liuwenfang wrote:
> > Supposed RT task(RT1) is running on CPU0 and RT task(RT2) is awakened
> > on CPU1,
> > RT1 becomes sleep and SCX task(SCX1) will be dispatched to CPU0, RT2
> > will be placed on CPU0:
> >
> > CPU0(schedule)                              CPU1(try_to_wake_up)
> > set_current_state(TASK_INTERRUPTIBLE)       try_to_wake_up # RT2
> > __schedule                                  select_task_rq # CPU0 is selected
> > LOCK rq(0)->lock # lock CPU0 rq             ttwu_queue
> > deactivate_task # RT1                       LOCK rq(0)->lock # busy waiting
> > pick_next_task # no more RT tasks on rq     |
> > prev_balance                                |
> > balance_scx                                 |
> > balance_one                                 |
> > rq->scx.cpu_released = false;               |
> > consume_global_dsq                          |
> > consume_dispatch_q                          |
> > consume_remote_task                         |
> > UNLOCK rq(0)->lock                          V
> >                                             LOCK rq(0)->lock # succ
> > deactivate_task # SCX1                      ttwu_do_activate
> > LOCK rq(0)->lock # busy waiting             activate_task # RT2 equeued
> > |                                           UNLOCK rq(0)->lock
> > V
> > LOCK rq(0)->lock # succ
> > activate_task # SCX1
> > pick_task # RT2 is picked
> > put_prev_set_next_task # prev is RT1, next is RT2,
> > rq->scx.cpu_released = false; UNLOCK rq(0)->lock
> >
> > At last, RT2 will be running on CPU0 with rq->scx.cpu_released being
> > false, which would lead the BPF scheduler to make decisions improperly.
> >
> > So, check the sched class in __put_prev_set_next_scx() to fix the
> > value of
> > rq->scx.cpu_released.
>
> Oh gawd, this is terrible.
>
> Why would you start the pick in balance and then cry when you fail the pick in
> pick ?!?
>
> This is also the reason you need that weird CLASS_EXT exception in
> prev_balance(), isn't it?
Yes, you are right. The exception is needed because that path involves task migration.
>
> You're now asking for a 3rd call out to do something like:
>
> ->balance() -- ready a task for pick
To clarify: the target SCX task is currently in the global queue, and its selected CPU may be, say, CPU2.
When CPU0 is about to go idle, this SCX task should be migrated to CPU0.
> ->pick() -- picks a random other task
The rq lock of CPU0 is released during task migration, so a higher-priority task can be enqueued on CPU0's rq in the meantime.
As a result, CPU0 will not always pick the target SCX task promptly.
> ->put_prev() -- oops, our task didn't get picked, stick it back
The higher-priority task may run on CPU0 for a long time, so we need to take the SCX task back to meet its low-latency requirement.
>
> Which is bloody ludicrous. So no. We're not doing this.
>
> Why can't pick DTRT ?
This is why CPU0 cannot directly pick an SCX task whose task_cpu() is not CPU0.
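
For illustration only, here is a minimal single-threaded C model of the interleaving described above. It is a sketch under assumptions, not kernel code: `struct toy_rq`, `toy_balance()`, `toy_pick()`, `toy_put_prev_set_next()` and `toy_run()` are hypothetical names invented for this example, and the RT wakeup that arrives while the rq lock is dropped is modelled as a plain counter increment. It only shows how the flag can end up stale and what a class check at put_prev/set_next time would change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for kernel state (not the real structs). */
enum toy_class { TOY_CLASS_RT, TOY_CLASS_EXT, TOY_CLASS_IDLE };

struct toy_rq {
	bool cpu_released;	/* models rq->scx.cpu_released */
	int nr_rt;		/* queued RT tasks */
	int nr_scx;		/* queued SCX tasks */
};

/*
 * Models the balance step: SCX claims the CPU, then migrates a remote task.
 * rt_wakeups models RT tasks enqueued while the rq lock is dropped.
 */
void toy_balance(struct toy_rq *rq, int rt_wakeups)
{
	rq->cpu_released = false;	/* SCX believes it owns the CPU */
	rq->nr_rt += rt_wakeups;	/* lock dropped: RT2 is enqueued here */
	rq->nr_scx += 1;		/* lock retaken: SCX1 is enqueued */
}

/* Models the pick step: RT always wins over SCX. */
enum toy_class toy_pick(const struct toy_rq *rq)
{
	if (rq->nr_rt > 0)
		return TOY_CLASS_RT;
	if (rq->nr_scx > 0)
		return TOY_CLASS_EXT;
	return TOY_CLASS_IDLE;
}

/*
 * Models put_prev/set_next. With check_class set, the flag is corrected
 * when a higher-priority class was picked (the idea proposed in the patch).
 */
void toy_put_prev_set_next(struct toy_rq *rq, enum toy_class next,
			   bool check_class)
{
	if (check_class && next == TOY_CLASS_RT)
		rq->cpu_released = true;
}

/* Replays the sequence and returns the final value of cpu_released. */
bool toy_run(bool check_class)
{
	/* RT1 was running before, so the CPU starts out released. */
	struct toy_rq rq = { .cpu_released = true, .nr_rt = 0, .nr_scx = 0 };

	toy_balance(&rq, 1);
	toy_put_prev_set_next(&rq, toy_pick(&rq), check_class);
	return rq.cpu_released;
}
```

Calling `toy_run(false)` replays the sequence without the class check, and `toy_run(true)` replays it with the check; the return value is the final state of the modelled flag while the RT task is the one picked.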