[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <tip-18f649ef344127ef6de23a5a4272dbe2fdb73dde@git.kernel.org>
Date: Tue, 22 Nov 2016 04:28:49 -0800
From: tip-bot for Oleg Nesterov <tipbot@...or.com>
To: linux-tip-commits@...r.kernel.org
Cc: mingo@...nel.org, peterz@...radead.org,
linux-kernel@...r.kernel.org, vlovejoy@...hat.com,
tglx@...utronix.de, oleg@...hat.com, efault@....de,
torvalds@...ux-foundation.org, hpa@...or.com
Subject: [tip:sched/urgent] sched/autogroup: Fix autogroup_move_group() to
never skip sched_move_task()
Commit-ID: 18f649ef344127ef6de23a5a4272dbe2fdb73dde
Gitweb: http://git.kernel.org/tip/18f649ef344127ef6de23a5a4272dbe2fdb73dde
Author: Oleg Nesterov <oleg@...hat.com>
AuthorDate: Mon, 14 Nov 2016 19:46:09 +0100
Committer: Ingo Molnar <mingo@...nel.org>
CommitDate: Tue, 22 Nov 2016 12:33:42 +0100
sched/autogroup: Fix autogroup_move_group() to never skip sched_move_task()
The PF_EXITING check in task_wants_autogroup() is no longer needed. Remove
it, but see the next patch.
However the comment is correct in that autogroup_move_group() must always
change task_group() for every thread so the sysctl_ check is very wrong;
we can race with cgroups and even sys_setsid() is not safe because a task
running with task_group() == ag->tg must participate in refcounting:
int main(void)
{
int sctl = open("/proc/sys/kernel/sched_autogroup_enabled", O_WRONLY);
assert(sctl > 0);
if (fork()) {
wait(NULL); // destroy the child's ag/tg
pause();
}
assert(pwrite(sctl, "1\n", 2, 0) == 2);
assert(setsid() > 0);
if (fork())
pause();
kill(getppid(), SIGKILL);
sleep(1);
// The child has gone, the grandchild runs with kref == 1
assert(pwrite(sctl, "0\n", 2, 0) == 2);
assert(setsid() > 0);
// runs with the freed ag/tg
for (;;)
sleep(1);
return 0;
}
crashes the kernel. It doesn't really need sleep(1), it doesn't matter if
autogroup_move_group() actually frees the task_group or this happens later.
Reported-by: Vern Lovejoy <vlovejoy@...hat.com>
Signed-off-by: Oleg Nesterov <oleg@...hat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Mike Galbraith <efault@....de>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: hartsjc@...hat.com
Cc: vbendel@...hat.com
Link: http://lkml.kernel.org/r/20161114184609.GA15965@redhat.com
Signed-off-by: Ingo Molnar <mingo@...nel.org>
---
kernel/sched/auto_group.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/kernel/sched/auto_group.c b/kernel/sched/auto_group.c
index a5d966c..ad2b19a 100644
--- a/kernel/sched/auto_group.c
+++ b/kernel/sched/auto_group.c
@@ -111,14 +111,11 @@ bool task_wants_autogroup(struct task_struct *p, struct task_group *tg)
{
if (tg != &root_task_group)
return false;
-
/*
- * We can only assume the task group can't go away on us if
- * autogroup_move_group() can see us on ->thread_group list.
+ * If we race with autogroup_move_group() the caller can use the old
+ * value of signal->autogroup but in this case sched_move_task() will
+ * be called again before autogroup_kref_put().
*/
- if (p->flags & PF_EXITING)
- return false;
-
return true;
}
@@ -138,13 +135,17 @@ autogroup_move_group(struct task_struct *p, struct autogroup *ag)
}
p->signal->autogroup = autogroup_kref_get(ag);
-
- if (!READ_ONCE(sysctl_sched_autogroup_enabled))
- goto out;
-
+ /*
+ * We can't avoid sched_move_task() after we changed signal->autogroup,
+ * this process can already run with task_group() == prev->tg or we can
+ * race with cgroup code which can read autogroup = prev under rq->lock.
+ * In the latter case for_each_thread() can not miss a migrating thread,
+ * cpu_cgroup_attach() must not be possible after cgroup_exit() and it
+ * can't be removed from thread list, we hold ->siglock.
+ */
for_each_thread(p, t)
sched_move_task(t);
-out:
+
unlock_task_sighand(p, &flags);
autogroup_kref_put(prev);
}
Powered by blists - more mailing lists