lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Sun, 15 Nov 2015 01:45:45 +0000 From: Ben Hutchings <ben@...adent.org.uk> To: linux-kernel@...r.kernel.org, stable@...r.kernel.org CC: akpm@...ux-foundation.org, "Oleg Nesterov" <oleg@...hat.com>, "Linus Torvalds" <torvalds@...ux-foundation.org>, "Peter Zijlstra" <peterz@...radead.org>, "Ingo Molnar" <mingo@...nel.org>, will.deacon@....com, manfred@...orfullife.com, "Thomas Gleixner" <tglx@...utronix.de> Subject: [PATCH 3.2 26/60] sched/core: Fix TASK_DEAD race in finish_task_switch() 3.2.73-rc1 review patch. If anyone has any objections, please let me know. ------------------ From: Peter Zijlstra <peterz@...radead.org> commit 95913d97914f44db2b81271c2e2ebd4d2ac2df83 upstream. So the problem this patch is trying to address is as follows: CPU0 CPU1 context_switch(A, B) ttwu(A) LOCK A->pi_lock A->on_cpu == 0 finish_task_switch(A) prev_state = A->state <-. WMB | A->on_cpu = 0; | UNLOCK rq0->lock | | context_switch(C, A) `-- A->state = TASK_DEAD prev_state == TASK_DEAD put_task_struct(A) context_switch(A, C) finish_task_switch(A) A->state == TASK_DEAD put_task_struct(A) The argument being that the WMB will allow the load of A->state on CPU0 to cross over and observe CPU1's store of A->state, which will then result in a double-drop and use-after-free. Now the comment states (and this was true once upon a long time ago) that we need to observe A->state while holding rq->lock because that will order us against the wakeup; however the wakeup will not in fact acquire (that) rq->lock; it takes A->pi_lock these days. We can obviously fix this by upgrading the WMB to an MB, but that is expensive, so we'd rather avoid that. The alternative this patch takes is: smp_store_release(&A->on_cpu, 0), which avoids the MB on some archs, but not important ones like ARM. Reported-by: Oleg Nesterov <oleg@...hat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org> Acked-by: Linus Torvalds <torvalds@...ux-foundation.org> Cc: Peter Zijlstra <peterz@...radead.org> Cc: Thomas Gleixner <tglx@...utronix.de> Cc: linux-kernel@...r.kernel.org Cc: manfred@...orfullife.com Cc: will.deacon@....com Fixes: e4a52bcb9a18 ("sched: Remove rq->lock from the first half of ttwu()") Link: http://lkml.kernel.org/r/20150929124509.GG3816@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@...nel.org> [bwh: Backported to 3.2: - Adjust filename - As smp_store_release() is not defined, use smp_mb()] Signed-off-by: Ben Hutchings <ben@...adent.org.uk> --- --- a/kernel/sched.c +++ b/kernel/sched.c @@ -1016,8 +1016,10 @@ static inline void finish_lock_switch(st * After ->on_cpu is cleared, the task can be moved to a different CPU. * We must ensure this doesn't happen until the switch is completely * finished. + * + * Pairs with the control dependency and rmb in try_to_wake_up(). */ - smp_wmb(); + smp_mb(); prev->on_cpu = 0; #endif #ifdef CONFIG_DEBUG_SPINLOCK @@ -3191,11 +3192,11 @@ static void finish_task_switch(struct rq * If a task dies, then it sets TASK_DEAD in tsk->state and calls * schedule one last time. The schedule call will never return, and * the scheduled task must drop that reference. - * The test for TASK_DEAD must occur while the runqueue locks are - * still held, otherwise prev could be scheduled on another cpu, die - * there before we look at prev->state, and then the reference would - * be dropped twice. - * Manfred Spraul <manfred@...orfullife.com> + * + * We must observe prev->state before clearing prev->on_cpu (in + * finish_lock_switch), otherwise a concurrent wakeup can get prev + * running on another CPU and we could rave with its RUNNING -> DEAD + * transition, resulting in a double drop. */ prev_state = prev->state; finish_arch_switch(prev); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists