Message-ID: <54D2AA16.6030706@oracle.com>
Date: Wed, 04 Feb 2015 18:24:06 -0500
From: Sasha Levin <sasha.levin@...cle.com>
To: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>
CC: Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrey Ryabinin <a.ryabinin@...sung.com>,
Dave Jones <davej@...emonkey.org.uk>,
LKML <linux-kernel@...r.kernel.org>
Subject: sched: memory corruption on completing completions
Hi all,
I was fuzzing with trinity on a -next kernel with the KASan patchset, and
got what initially appeared to be a rather odd trace:
[ 856.817966] BUG: AddressSanitizer: out of bounds on stack in do_raw_spin_unlock+0x417/0x4f0 at addr ffff8803875c7c42
[ 856.817966] Read of size 2 by task migration/15/140
[ 856.821565] page:ffffea000e1d71c0 count:0 mapcount:0 mapping: (null) index:0x0
[ 856.821565] flags: 0x2efffff80000000()
[ 856.821565] page dumped because: kasan: bad access detected
[ 856.821565] CPU: 15 PID: 140 Comm: migration/15 Not tainted 3.19.0-rc5-next-20150123-sasha-00061-g527ff0d-dirty #1814
[ 856.841712] 0000000000000000 0000000000000000 ffff8804d0447aa0 ffff8804d04479e8
[ 856.843478] ffffffff92eb033b 1ffffd4001c3ae3f ffffea000e1d71c0 ffff8804d0447a88
[ 856.843478] ffffffff81b4c292 ffff8804d0447a58 ffffffff815ef1e1 0000000000000000
[ 856.843478] Call Trace:
[ 856.843478] dump_stack (lib/dump_stack.c:52)
[ 856.843478] kasan_report_error (mm/kasan/report.c:136 mm/kasan/report.c:194)
[ 856.843478] ? __lock_is_held (kernel/locking/lockdep.c:3518)
[ 856.843478] ? sched_ttwu_pending (kernel/sched/core.c:4921)
[ 856.843478] __asan_report_load2_noabort (mm/kasan/report.c:234)
[ 856.843478] ? do_raw_spin_unlock (./arch/x86/include/asm/spinlock.h:157 kernel/locking/spinlock_debug.c:159)
[ 856.843478] do_raw_spin_unlock (./arch/x86/include/asm/spinlock.h:157 kernel/locking/spinlock_debug.c:159)
[ 856.843478] ? do_raw_spin_trylock (kernel/locking/spinlock_debug.c:157)
[ 856.843478] _raw_spin_unlock_irqrestore (include/linux/spinlock_api_smp.h:162 kernel/locking/spinlock.c:191)
[ 856.843478] complete (kernel/sched/completion.c:37)
[ 856.843478] cpu_stop_signal_done (kernel/stop_machine.c:69)
[ 856.843478] cpu_stopper_thread (include/linux/spinlock.h:342 kernel/stop_machine.c:456)
[ 856.843478] ? irq_cpu_stop_queue_work (kernel/stop_machine.c:449)
[ 856.843478] ? do_raw_spin_trylock (kernel/locking/spinlock_debug.c:157)
[ 856.843478] ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/paravirt.h:809 include/linux/spinlock_api_smp.h:162 kernel/locking/spinlock.c:191)
[ 856.843478] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2554 kernel/locking/lockdep.c:2601)
[ 856.843478] ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:163 kernel/locking/spinlock.c:191)
[ 856.843478] smpboot_thread_fn (kernel/smpboot.c:161)
[ 856.843478] ? smpboot_unpark_thread (kernel/smpboot.c:105)
[ 856.843478] ? schedule (kernel/sched/core.c:2853)
[ 856.843478] ? __kthread_parkme (./arch/x86/include/asm/current.h:14 kernel/kthread.c:164)
[ 856.843478] ? smpboot_unpark_thread (kernel/smpboot.c:105)
[ 856.843478] kthread (kernel/kthread.c:207)
[ 856.843478] ? flush_kthread_work (kernel/kthread.c:176)
[ 856.843478] ? schedule_tail (./arch/x86/include/asm/preempt.h:95 kernel/sched/core.c:2318)
[ 856.843478] ? flush_kthread_work (kernel/kthread.c:176)
[ 856.843478] ret_from_fork (arch/x86/kernel/entry_64.S:349)
[ 856.843478] ? flush_kthread_work (kernel/kthread.c:176)
[ 856.843478] Memory state around the buggy address:
[ 856.843478] ffff8803875c7b00: f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 856.843478] ffff8803875c7b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 856.843478] >ffff8803875c7c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1
[ 856.843478] ^
[ 856.843478] ffff8803875c7c80: f1 f1 f1 00 f4 f4 f4 f2 f2 f2 f2 00 00 f4 f4 00
[ 856.843478] ffff8803875c7d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
We only see this trace now because Andrey recently taught KASan to detect bad accesses to on-stack
memory, rather than just to kmalloc()ed memory, so the underlying bug has probably been there for
a while.
Initially we couldn't explain it. It reproduced often, and always on the
spin_unlock_irqrestore() call at the end of complete().
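For reference, complete() on this kernel is essentially the following (quoted from memory from
kernel/sched/completion.c, which the completion.c:37 frame in the trace points into; the comments
are mine, and the exact body may differ slightly):

	void complete(struct completion *x)
	{
		unsigned long flags;

		spin_lock_irqsave(&x->wait.lock, flags);
		x->done++;
		/* The waiter is woken while we still hold x->wait.lock. */
		__wake_up_locked(&x->wait, TASK_NORMAL, 1);
		/* completion.c:37 - the unlock where KASan reports the bad read. */
		spin_unlock_irqrestore(&x->wait.lock, flags);
	}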
I now have a theory for why it happens:
Thread A                                        Thread B
----------------------------------------------------------
[Enter function]
DECLARE_COMPLETION_ONSTACK(x)
wait_for_completion(x)
                                                complete(x)
                                                [In complete(x):]
                                                spin_lock_irqsave(&x->wait.lock, flags);
                                                x->done++;
                                                __wake_up_locked(&x->wait, TASK_NORMAL, 1);
[Done waiting, wakes up]
[Exit function]
                                                spin_unlock_irqrestore(&x->wait.lock, flags);
So the spin_unlock_irqrestore() at the end of complete() would proceed to corrupt the stack
of thread A: x lives in thread A's stack frame, and once thread A has exited the function that
frame can be reused, so the unlock touches memory that no longer belongs to the completion.
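To make the waiter side of the diagram concrete, here is a hypothetical sketch of the usage
pattern involved. kick_off_work() is made up for illustration (it stands for whatever hands the
completion to the other thread); DECLARE_COMPLETION_ONSTACK() and wait_for_completion() are the
real APIs:

	static int foo_wait_for_thing(void)
	{
		/* 'done' lives in this function's stack frame. */
		DECLARE_COMPLETION_ONSTACK(done);

		/* Hypothetical helper: hands &done to another thread,
		 * which will eventually call complete(&done). */
		kick_off_work(&done);

		/* Returns as soon as the other thread has done x->done++
		 * and the wakeup; we may leave this function (and give up
		 * the stack frame) before that thread finishes its unlock. */
		wait_for_completion(&done);

		return 0;
	}

In the trace above, thread B is presumably the stopper thread calling complete() from
cpu_stop_signal_done(), and thread A one of the stop_machine()/stop_one_cpu() callers, which
keep their cpu_stop_done on the stack.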
Thanks,
Sasha