Message-ID: <86b2ef1b47f25c1ae9eb8006f01d1bba2d7a5421.camel@codethink.co.uk>
Date: Tue, 10 Dec 2024 17:08:16 +0100
From: Marcel Ziswiler <marcel.ziswiler@...ethink.co.uk>
To: Peter Zijlstra <peterz@...radead.org>, Vineeth Remanan Pillai
<vineeth@...byteword.org>
Cc: Joel Fernandes <joel@...lfernandes.org>, Ilya Maximets
<i.maximets@....org>, LKML <linux-kernel@...r.kernel.org>, Ingo Molnar
<mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot
<vincent.guittot@...aro.org>, vineethrp@...gle.com, shraash@...gle.com
Subject: Re: [v6.12] WARNING: at kernel/sched/deadline.c:1995
enqueue_dl_entity (task blocked for more than 28262 seconds)
On Mon, 2024-12-09 at 13:56 +0100, Peter Zijlstra wrote:
> On Mon, Dec 09, 2024 at 07:29:52AM -0500, Vineeth Remanan Pillai wrote:
> > On Mon, Dec 9, 2024 at 5:55 AM Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > On Fri, Dec 06, 2024 at 11:57:30AM -0500, Vineeth Remanan Pillai wrote:
> > >
> > > > I was able to reproduce this WARN_ON a couple of days back with
> > > > syzkaller. The dlserver's dl_se gets enqueued during an update_curr
> > > > while the dlserver is stopped, and a subsequent dlserver start then
> > > > causes a double enqueue.
> > >
> > > Right, I spotted that hole late last week. There is this thread:
> > >
> > > https://lore.kernel.org/all/20241209094941.GF21636@noisy.programming.kicks-ass.net/T/#u
> > >
> > > Where I just added this thunk:
> > >
> > > @@ -1674,6 +1679,12 @@ void dl_server_start(struct sched_dl_entity *dl_se)
> > >
> > > void dl_server_stop(struct sched_dl_entity *dl_se)
> > > {
> > > + if (current->dl_server == dl_se) {
> > > + struct rq *rq = rq_of_dl_se(dl_se);
> > > + trace_printk("stop fair server %d\n", cpu_of(rq));
> > > + current->dl_server = NULL;
> > > + }
> > > +
> > > if (!dl_se->dl_runtime)
> > > return;
> > >
> > > Which was my attempt at plugging said hole. But since I have no
> > > means of reproducing it, I'm not at all sure it is sufficient :/
> > >
> > I think I was able to get to the root cause last week. The issue
> > seems to be that the dlserver is stopped in the pick_task path of
> > the dlserver itself when sched_delayed is set (a trimmed sketch of
> > that branch follows the call chain):
> > __pick_next_task
> > => pick_task_dl -> server_pick_task
> > => pick_task_fair
> > => pick_next_entity (if (sched_delayed))
> > => dequeue_entities
> > => dl_server_stop
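> > 
> > Roughly, the delayed-dequeue branch looks like this (heavily trimmed
> > from kernel/sched/fair.c; buddy handling and other details elided):
> > 
> > static struct sched_entity *
> > pick_next_entity(struct rq *rq, struct cfs_rq *cfs_rq)
> > {
> > 	struct sched_entity *se = pick_eevdf(cfs_rq);
> > 
> > 	if (se->sched_delayed) {
> > 		/*
> > 		 * Dequeue the delayed entity. If this empties the CFS
> > 		 * hierarchy, dequeue_entities() ends up calling
> > 		 * dl_server_stop() -- while we are still inside the
> > 		 * server's own pick.
> > 		 */
> > 		dequeue_entities(rq, se, DEQUEUE_SLEEP | DEQUEUE_DELAYED);
> > 		return NULL;
> > 	}
> > 	return se;
> > }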
>
> Ooh, that's where it happens.
>
> So the scenario I had in mind was that we were doing something like:
>
> 	current->state = TASK_INTERRUPTIBLE;
> 	schedule();
> 	  deactivate_task()
> 	    dl_server_stop();
> 	  pick_next_task()
> 	    pick_next_task_fair()
> 	      sched_balance_newidle()
> 	        rq_unlock(this_rq)
>
> at which point another CPU can take our RQ-lock and do:
>
> 	try_to_wake_up()
> 	  ttwu_queue()
> 	    rq_lock()
> 	    ...
> 	    activate_task()
> 	      dl_server_start()
> 	    wakeup_preempt() := check_preempt_wakeup_fair()
> 	      update_curr()
> 	        update_curr_task()
> 	          if (current->dl_server)
> 	            dl_server_update()
> 	              enqueue_dl_entity()
>
>
> Which then also goes *bang*. The above can't happen if we clear
> current->dl_server in dl_server_stop().
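> 
> For reference, the consumer side here is update_curr_task() in
> kernel/sched/fair.c, which (simplified) does:
> 
> 	if (p->dl_server)
> 		dl_server_update(p->dl_server, delta_exec);
> 
> so once current->dl_server is cleared, the wakeup-side update_curr()
> can no longer poke the stopped server.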
>
> I was worried that might not be it, because Marcel had bisected it to
> the delayed stuff, but I'd not managed to reach the pick site yet :/
>
> > Now server_pick_task returns NULL, so we set dl_yielded and call
> > update_curr_dl_se. But dl_se is already dequeued, and the code gets
> > confused and does all sorts of things, including arming a timer to
> > enqueue it back. This ultimately leads to a double enqueue the next
> > time the dlserver is started (depending on the timing of
> > dl_server_start).
> >
> > I think we should not call update_curr_dl_se when the dlserver is
> > dequeued. Based on this, I have a small patch, and it seems to
> > solve the issue:
> >
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index d9d5a702f1a6..a9f3f020e421 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -2419,12 +2419,18 @@ static struct task_struct *__pick_task_dl(struct rq *rq)
> >
> > if (dl_server(dl_se)) {
> > p = dl_se->server_pick_task(dl_se);
> > - if (!p) {
> > + if (p) {
> > + rq->dl_server = dl_se;
> > + } else if (WARN_ON_ONCE(on_dl_rq(dl_se))) {
> > + /*
> > + * If server_pick_task returns NULL and dlserver is
> > +                        * enqueued, we have a problem. Let's yield and do a
> > + * pick again.
> > + */
> > dl_se->dl_yielded = 1;
> > update_curr_dl_se(rq, dl_se, 0);
> > goto again;
> > }
> > - rq->dl_server = dl_se;
> > } else {
> > p = dl_task_of(dl_se);
> > }
>
> Hmm.. so fundamentally that yield() makes sense, but yeah, it's lost
> track of the fact that we've stopped the server and it should not
> continue.
>
> Does something like the below make sense?
At least it stopped crashing for me:
https://drive.codethink.co.uk/s/XRqN2y9BLwPsD9H
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index d380bffee2ef..abebeb67de4e 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -664,6 +664,7 @@ struct sched_dl_entity {
> unsigned int dl_non_contending : 1;
> unsigned int dl_overrun : 1;
> unsigned int dl_server : 1;
> + unsigned int dl_server_active : 1;
> unsigned int dl_defer : 1;
> unsigned int dl_defer_armed : 1;
> unsigned int dl_defer_running : 1;
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index d9d5a702f1a6..e2b542f684db 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1647,6 +1647,7 @@ void dl_server_start(struct sched_dl_entity *dl_se)
> if (!dl_se->dl_runtime)
> return;
>
> + dl_se->dl_server_active = 1;
> enqueue_dl_entity(dl_se, ENQUEUE_WAKEUP);
> if (!dl_task(dl_se->rq->curr) || dl_entity_preempt(dl_se, &rq->curr->dl))
> resched_curr(dl_se->rq);
> @@ -1661,6 +1662,7 @@ void dl_server_stop(struct sched_dl_entity *dl_se)
> hrtimer_try_to_cancel(&dl_se->dl_timer);
> dl_se->dl_defer_armed = 0;
> dl_se->dl_throttled = 0;
> + dl_se->dl_server_active = 0;
> }
>
> void dl_server_init(struct sched_dl_entity *dl_se, struct rq *rq,
> @@ -2420,8 +2422,10 @@ static struct task_struct *__pick_task_dl(struct rq *rq)
> if (dl_server(dl_se)) {
> p = dl_se->server_pick_task(dl_se);
> if (!p) {
> - dl_se->dl_yielded = 1;
> - update_curr_dl_se(rq, dl_se, 0);
> + if (dl_se->dl_server_active) {
> + dl_se->dl_yielded = 1;
> + update_curr_dl_se(rq, dl_se, 0);
> + }
> goto again;
> }
> rq->dl_server = dl_se;
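
To double-check my understanding of the new flag, here is a toy
userspace model (not kernel code; the toy_* names are made up,
enqueue/dequeue and the timer re-arm are collapsed into single
assignments, and the assert stands in for the WARN_ON at
deadline.c:1995):

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct toy_dl_se {
	bool on_rq;            /* models on_dl_rq() */
	bool dl_server_active; /* the new flag from the patch */
};

static void toy_server_start(struct toy_dl_se *se)
{
	se->dl_server_active = true;
	assert(!se->on_rq);    /* a double enqueue would trip here */
	se->on_rq = true;      /* enqueue_dl_entity() */
}

static void toy_server_stop(struct toy_dl_se *se)
{
	se->on_rq = false;     /* dequeue_dl_entity() */
	se->dl_server_active = false;
}

/* Models the NULL-pick branch of __pick_task_dl(). */
static void toy_pick_null(struct toy_dl_se *se)
{
	if (se->dl_server_active)
		se->on_rq = true;  /* yield + update_curr_dl_se() re-arm */
}

int main(void)
{
	struct toy_dl_se se = { false, false };

	toy_server_start(&se);
	toy_server_stop(&se);  /* pick path hit the delayed dequeue */
	toy_pick_null(&se);    /* flag is clear: nothing gets re-armed */
	toy_server_start(&se); /* no double enqueue, the assert holds */
	puts("ok");
	return 0;
}

Without the dl_server_active check in toy_pick_null(), the second
toy_server_start() would trip the assert, which matches the double
enqueue behind the original WARN_ON.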