Message-ID: <656260cf50684c11a3122aca88dde0cb@SVR-IES-MBX-03.mgc.mentorg.com>
Date: Tue, 3 Dec 2019 10:51:46 +0000
From: "Schmid, Carsten" <Carsten_Schmid@...tor.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: "mingo@...hat.com" <mingo@...hat.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: Crash in fair scheduler
> > we had a crash in the fair scheduler and analysis shows that this could
> > happen again.
> > Happened on 4.14.86 (LTS series) but the failing code path still exists in
> > 5.4-rc2 (and 4.14.147 too).
>
> Please, do try if you can reproduce with Linus' latest git. I've no idea
> what is, or is not, in those stable trees.
>
Unfortunately a one-time issue so far ...
--- snip ---
> > include/linux/rbtree.h:91:#define rb_first_cached(root) (root)->rb_leftmost
>
> > struct sched_entity *__pick_first_entity(struct cfs_rq *cfs_rq)
> > {
> > 	struct rb_node *left = rb_first_cached(&cfs_rq->tasks_timeline);
> >
> > 	if (!left)
> > 		return NULL;	<<<<<<<<<< the case
> >
> > 	return rb_entry(left, struct sched_entity, run_node);
> > }
>
> This is the problem: for some reason the rbtree code got that rb_leftmost
> thing wrecked.
>
Is there any known issue in the rbtree code regarding this?
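For my own reference, this is how I understand the cached leftmost is maintained
(a sketch based on include/linux/rbtree.h in mainline; the exact code may differ
slightly between versions), so a non-empty tree should never end up with
rb_leftmost == NULL unless it got corrupted somewhere:

static inline void rb_insert_color_cached(struct rb_node *node,
					  struct rb_root_cached *root,
					  bool leftmost)
{
	/* caller tells us whether the new node became the leftmost */
	if (leftmost)
		root->rb_leftmost = node;
	rb_insert_color(node, &root->rb_root);
}

static inline void rb_erase_cached(struct rb_node *node,
				   struct rb_root_cached *root)
{
	/* erasing the cached leftmost moves the cache to its successor */
	if (root->rb_leftmost == node)
		root->rb_leftmost = rb_next(node);
	rb_erase(node, &root->rb_root);
}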
> > Is this a corner case nobody thought of, or do we have cfs_rq data that is
> > unexpected in its content?
>
> No, the rbtree is corrupt. Your tree has a single node (which matches
> with nr_running), but for some reason it thinks rb_leftmost is NULL.
> This is wrong, if the tree is non-empty, it must have a leftmost
> element.
Is there a chance to find the leftmost element in the core dump?
Maybe I can dig deeper to find the root cause then.
Does any of the structs/data in this context point to some memory
where I can continue the search?
Where should rb_leftmost point if only one node is in the tree?
To the node itself?
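(My understanding: with exactly one node in the tree, rb_leftmost should point
to that node itself, i.e. to the run_node embedded in the single sched_entity,
the same node that tasks_timeline.rb_root.rb_node points to.)
A hypothetical helper, not kernel code, just to illustrate how I would walk the
tree from the root in the dump to cross-check the cached value:

/* Walk rb_left from the root to find the real leftmost node;
 * for a consistent tree this must equal root->rb_leftmost. */
static struct rb_node *find_leftmost(struct rb_root_cached *root)
{
	struct rb_node *node = root->rb_root.rb_node;

	if (!node)
		return NULL;		/* tree is empty */

	while (node->rb_left)
		node = node->rb_left;

	return node;
}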
>
> Can you reproduce at will? If so, can you please try the latest kernel,
> and or share the reproducer?
Unfortunately this was a "once" issue so far; I don't have a reproducer yet.
Thanks,
Carsten