Message-ID: <CABAubThfDMnA8g5Fxdwwu79V9sEDgQYvvWY675757LZXnyMcKQ@mail.gmail.com>
Date: Mon, 14 Sep 2015 22:32:42 -0700
From: Shayan Pooya <shayan@...eve.org>
To: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org
Subject: [PATCH] sched/fair: adjust the depth of a sched_entity when its
parent changes
From 64a24d04c6510dcc144aba123fb21ed6f895c6b7 Mon Sep 17 00:00:00 2001
From: Shayan Pooya <shayan@...eve.org>
Date: Mon, 14 Sep 2015 21:25:09 -0700
Subject: [PATCH] sched/fair: adjust the depth of a sched_entity when its
parent changes
This fixes commit fed14d45f945 ("sched/fair: Track cgroup depth").
I hit the kernel panic described in https://lkml.org/lkml/2014/2/15/217
while running docker on kernel 3.16.
The issue has also been reported elsewhere, including:
https://github.com/docker/docker/issues/13940
https://gist.github.com/burke/c60dc5b8f0ba9bfd9275
The latter also has an analysis and a similar patch (which was never
submitted to lkml).
The panic is at RIP check_preempt_wakeup+255. The surrounding code is:
<check_preempt_wakeup+248>: mov 0x148(%rbx),%rbx
<check_preempt_wakeup+255>: mov 0x150(%r12),%rdi
<check_preempt_wakeup+263>: cmp 0x150(%rbx),%rdi
The 0x150 offset is that of the cfs_rq member of struct sched_entity:
crash> p &((struct sched_entity *)0)->cfs_rq
$10 = (struct cfs_rq **) 0x150
This points to the inlined function find_matching_se() and the while
loop in it. Inspecting the task that was about to be scheduled in
check_preempt_wakeup():
crash> p ((struct task_struct *) 0xffff8808506fd180)->se.depth
$2 = 1
crash> p ((struct task_struct *) 0xffff8808506fd180)->se.parent
$3 = (struct sched_entity *) 0xffff8808533c0c00
crash> p ((struct task_struct *) 0xffff8808506fd180)->se.parent->depth
$4 = 1
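The task and its parent report the same depth. That is incorrect (a
task's depth must be its parent's depth plus one) and is the root cause
of the panic.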
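To illustrate why a stale depth ends in a NULL dereference, here is a minimal user-space sketch loosely modelled on the walk in find_matching_se(). The types and names (struct entity, find_matching, walk_with_task_depth) and the two-branch hierarchy are hypothetical stand-ins, not the kernel's definitions. Unlike the kernel loop, the sketch checks for NULL and returns -1 at the point where the unchecked code would fault.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct cfs_rq { int id; };

struct entity {
	int depth;              /* cached distance from the top level */
	struct entity *parent;  /* NULL for a top-level entity */
	struct cfs_rq *cfs_rq;  /* runqueue this entity is queued on */
};

/*
 * Walk both entities up until they sit on the same cfs_rq.
 * Returns 0 on success and -1 where an unchecked walk would have
 * dereferenced a NULL parent.
 */
static int find_matching(struct entity **se, struct entity **pse)
{
	int se_depth, pse_depth;

	assert(se && *se && pse && *pse);
	se_depth = (*se)->depth;
	pse_depth = (*pse)->depth;

	/* First bring both entities to the same (cached) depth. */
	while (se_depth > pse_depth) {
		se_depth--;
		*se = (*se)->parent;
		if (*se == NULL)
			return -1;
	}
	while (pse_depth > se_depth) {
		pse_depth--;
		*pse = (*pse)->parent;
		if (*pse == NULL)
			return -1;
	}

	/* Then climb in lockstep until the runqueues match. */
	for (;;) {
		if (*se == NULL || *pse == NULL)
			return -1;
		if ((*se)->cfs_rq == (*pse)->cfs_rq)
			return 0;
		*se = (*se)->parent;
		*pse = (*pse)->parent;
	}
}

/*
 * Build two branches under one root runqueue:
 *   g1 (depth 0) -> g2 (depth 1) -> task_a (true depth 2)
 *   h  (depth 0) -> task_b (depth 1)
 * and run the walk with task_a's cached depth set to task_depth.
 */
static int walk_with_task_depth(int task_depth)
{
	struct cfs_rq root_rq = { 0 }, g1_rq = { 1 }, g2_rq = { 2 }, h_rq = { 3 };
	struct entity g1 = { 0, NULL, &root_rq };
	struct entity g2 = { 1, &g1, &g1_rq };
	struct entity task_a = { task_depth, &g2, &g2_rq };
	struct entity h = { 0, NULL, &root_rq };
	struct entity task_b = { 1, &h, &h_rq };
	struct entity *se = &task_a, *pse = &task_b;

	return find_matching(&se, &pse);
}
```

With the correct depth, walk_with_task_depth(2) finds g1 and h on the root runqueue. With the stale value seen in the crash dump, walk_with_task_depth(1) skips the levelling step, so the shorter branch runs out of parents first.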
set_task_rq(), modified below, was the only place where the parent could
change without the depth being adjusted.
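The idea behind the change can be sketched in isolation: if the parent pointer and the cached depth are always written by the same helper, no caller can update one and forget the other. The names below (struct node, node_set_parent, depth_after_move) are hypothetical and only mirror the shape of the one-line depth computation in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal entity: only the two fields that must stay in sync. */
struct node {
	int depth;
	struct node *parent;
};

/* Reparent and refresh the cached depth in one step. */
static void node_set_parent(struct node *n, struct node *parent)
{
	assert(n != NULL);
	n->parent = parent;
	n->depth = parent ? parent->depth + 1 : 0;
}

/* Move a task from a top-level group into a nested group. */
static int depth_after_move(void)
{
	struct node root_group = { 0, NULL };
	struct node child_group = { 0, NULL };
	struct node task = { 0, NULL };

	node_set_parent(&child_group, &root_group); /* child_group.depth == 1 */
	node_set_parent(&task, &root_group);        /* task.depth == 1 */
	node_set_parent(&task, &child_group);       /* task.depth == 2 */
	return task.depth;
}
```

After the move the task's depth is again one more than its new parent's, which is the invariant the crash dump showed to be violated.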
Signed-off-by: Shayan Pooya <shayan@...eve.org>
---
kernel/sched/fair.c | 1 -
kernel/sched/sched.h | 1 +
2 files changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6e2e348..ced5534 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8035,7 +8035,6 @@ static void task_move_group_fair(struct task_struct *p, int queued)
if (!queued)
se->vruntime -= cfs_rq_of(se)->min_vruntime;
set_task_rq(p, task_cpu(p));
- se->depth = se->parent ? se->parent->depth + 1 : 0;
if (!queued) {
cfs_rq = cfs_rq_of(se);
se->vruntime += cfs_rq->min_vruntime;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 68cda11..507d30f 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -931,6 +931,7 @@ static inline void set_task_rq(struct task_struct *p, unsigned int cpu)
#ifdef CONFIG_FAIR_GROUP_SCHED
p->se.cfs_rq = tg->cfs_rq[cpu];
p->se.parent = tg->se[cpu];
+ p->se.depth = p->se.parent ? p->se.parent->depth + 1 : 0;
#endif
#ifdef CONFIG_RT_GROUP_SCHED
--
2.1.0