Date: Mon, 6 Mar 2017 10:24:24 +0100
From: Dmitry Vyukov <dvyukov@...gle.com>
To: Paul McKenney <paulmck@...ux.vnet.ibm.com>
Cc: josh@...htriplett.org, Steven Rostedt <rostedt@...dmis.org>,
    Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
    jiangshanlai@...il.com, LKML <linux-kernel@...r.kernel.org>,
    syzkaller <syzkaller@...glegroups.com>
Subject: Re: rcu: WARNING in rcu_seq_end

On Sun, Mar 5, 2017 at 7:47 PM, Paul E. McKenney
<paulmck@...ux.vnet.ibm.com> wrote:
> On Sun, Mar 05, 2017 at 11:50:39AM +0100, Dmitry Vyukov wrote:
>> On Sat, Mar 4, 2017 at 9:40 PM, Paul E. McKenney
>> <paulmck@...ux.vnet.ibm.com> wrote:
>> > On Sat, Mar 04, 2017 at 05:01:19PM +0100, Dmitry Vyukov wrote:
>> >> Hello,
>> >>
>> >> Paul, you wanted bugs in rcu.
>> >
>> > Well, whether I want them or not, I must deal with them. ;-)
>> >
>> >> I've got this WARNING while running syzkaller fuzzer on
>> >> 86292b33d4b79ee03e2f43ea0381ef85f077c760:
>> >>
>> >> ------------[ cut here ]------------
>> >> WARNING: CPU: 0 PID: 4832 at kernel/rcu/tree.c:3533
>> >> rcu_seq_end+0x110/0x140 kernel/rcu/tree.c:3533
>> >> Kernel panic - not syncing: panic_on_warn set ...
>> >> CPU: 0 PID: 4832 Comm: kworker/0:3 Not tainted 4.10.0+ #276
>> >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> >> Workqueue: events wait_rcu_exp_gp
>> >> Call Trace:
>> >>  __dump_stack lib/dump_stack.c:15 [inline]
>> >>  dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>> >>  panic+0x1fb/0x412 kernel/panic.c:179
>> >>  __warn+0x1c4/0x1e0 kernel/panic.c:540
>> >>  warn_slowpath_null+0x2c/0x40 kernel/panic.c:583
>> >>  rcu_seq_end+0x110/0x140 kernel/rcu/tree.c:3533
>> >>  rcu_exp_gp_seq_end kernel/rcu/tree_exp.h:36 [inline]
>> >>  rcu_exp_wait_wake+0x8a9/0x1330 kernel/rcu/tree_exp.h:517
>> >>  rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:559 [inline]
>> >>  wait_rcu_exp_gp+0x83/0xc0 kernel/rcu/tree_exp.h:570
>> >>  process_one_work+0xc06/0x1c20 kernel/workqueue.c:2096
>> >>  worker_thread+0x223/0x19c0 kernel/workqueue.c:2230
>> >>  kthread+0x326/0x3f0 kernel/kthread.c:227
>> >>  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
>> >> Dumping ftrace buffer:
>> >>    (ftrace buffer empty)
>> >> Kernel Offset: disabled
>> >> Rebooting in 86400 seconds..
>> >>
>> >>
>> >> Not reproducible. But looking at the code, shouldn't it be:
>> >>
>> >>  static void rcu_seq_end(unsigned long *sp)
>> >>  {
>> >>         smp_mb(); /* Ensure update-side operation before counter increment. */
>> >> +       WARN_ON_ONCE(!(*sp & 0x1));
>> >>         WRITE_ONCE(*sp, *sp + 1);
>> >> -       WARN_ON_ONCE(*sp & 0x1);
>> >>  }
>> >>
>> >> ?
>> >>
>> >> Otherwise wait_event in _synchronize_rcu_expedited can return as soon
>> >> as WRITE_ONCE(*sp, *sp + 1) finishes. As far as I understand this
>> >> consequently can allow start of next grace periods. Which in turn can
>> >> make the warning fire. Am I missing something?
>> >>
>> >> I don't see any other bad consequences of this. The rest of
>> >> rcu_exp_wait_wake can proceed when _synchronize_rcu_expedited has
>> >> returned and destroyed work on stack and next period has started and
>> >> ended, but it seems OK.
>> >
>> > I believe that this is a good change, but I don't see how it will
>> > help in this case.  BTW, may I have your Signed-off-by?
>> >
>> > The reason I don't believe that it will help is that the
>> > rcu_exp_gp_seq_end() function is called from a workqueue handler that
>> > is invoked holding ->exp_mutex, and this mutex is not released until
>> > after the handler invokes rcu_seq_end() and then wakes up the task that
>> > scheduled the workqueue handler.  So the ordering above should not matter
>> > (but I agree that your ordering is cleaner).
>> >
>> > That said, it looks like I am missing some memory barriers, please
>> > see the following patch.
>> >
>> > But what architecture did you see this on?
>>
>>
>> This is just x86.
>>
>> You seem to assume that wait_event() waits for the wakeup. It does not
>> work this way. It can return as soon as the condition becomes true
>> without ever waiting:
>>
>> #define wait_event(wq, condition)                                 \
>> do {                                                              \
>>         might_sleep();                                            \
>>         if (condition)                                            \
>>                 break;                                            \
>>         __wait_event(wq, condition);                              \
>> } while (0)
>
> Agreed, hence my patch in the previous email.  I guess I knew that, but

Ah, you meant to synchronize rcu_seq_end with rcu_seq_done?
I think you placed the barrier incorrectly for that. rcu_exp_wait_wake
is already too late. The write that unblocks the waiter is in rcu_seq_end,
so you need a release barrier _before_ that write.
Also, can we please start using smp_load_acquire/smp_store_release where
they are what the doctor ordered. They are faster, more readable, better
for race detectors _and_ would prevent you from introducing this bug,
because you would need to find the exact write that signifies
completion.
I.e.:

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index d80c2587bed8..aa7ba83f6a56 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3534,7 +3534,7 @@ static void rcu_seq_start(unsigned long *sp)
 static void rcu_seq_end(unsigned long *sp)
 {
        smp_mb(); /* Ensure update-side operation before counter increment. */
-       WRITE_ONCE(*sp, *sp + 1);
+       smp_store_release(sp, *sp + 1);
        WARN_ON_ONCE(*sp & 0x1);
 }

@@ -3554,7 +3554,7 @@ static unsigned long rcu_seq_snap(unsigned long *sp)
  */
 static bool rcu_seq_done(unsigned long *sp, unsigned long s)
 {
-       return ULONG_CMP_GE(READ_ONCE(*sp), s);
+       return ULONG_CMP_GE(smp_load_acquire(sp), s);
 }


> on the day I wrote that code, my fingers didn't.  Or some similar lame
> excuse.  ;-)
>
>> Mailed a signed patch:
>> https://groups.google.com/d/msg/syzkaller/XzUXuAzKkCw/5054wU9MEAAJ
>
> This is the patch you also sent by email, that moves the WARN_ON_ONCE(),
> thank you!
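
For readers who want to experiment outside the kernel, the release/acquire
pairing discussed in this message can be approximated in userspace with C11
atomics. This is only a sketch under stated assumptions, not the kernel
code: seq_start(), seq_end(), seq_done(), updater(), run_demo() and payload
are hypothetical names made up for illustration, atomic_store_explicit()
with memory_order_release stands in for smp_store_release(),
atomic_load_explicit() with memory_order_acquire stands in for
smp_load_acquire(), and a seq_cst fence is used as a rough analogue of
smp_mb(). The odd-value check is placed before the store, as in the
WARN_ON_ONCE() move proposed earlier in the thread, and the waiter polls the
condition the way the wait_event() fast path does, without relying on any
wakeup.

```c
/* Userspace sketch with C11 atomics and pthreads; NOT kernel code. */
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_ulong seq;      /* even: idle, odd: operation in progress */
static unsigned long payload; /* plain data published by the updater */

/* Hypothetical analogue of rcu_seq_start(): make the counter odd. */
static void seq_start(void)
{
	unsigned long cur = atomic_load_explicit(&seq, memory_order_relaxed);

	assert(!(cur & 0x1)); /* must be idle before starting */
	atomic_store_explicit(&seq, cur + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* rough smp_mb() stand-in */
}

/* Hypothetical analogue of rcu_seq_end(): check first, then publish. */
static void seq_end(void)
{
	unsigned long cur = atomic_load_explicit(&seq, memory_order_relaxed);

	assert(cur & 0x1); /* checked BEFORE the store that unblocks waiters */
	atomic_store_explicit(&seq, cur + 1, memory_order_release);
}

/* Hypothetical analogue of rcu_seq_done(): acquire pairs with the release. */
static bool seq_done(unsigned long s)
{
	unsigned long cur = atomic_load_explicit(&seq, memory_order_acquire);

	return ULONG_MAX / 2 >= cur - s; /* wrap-safe "cur >= s" */
}

static void *updater(void *arg)
{
	(void)arg;
	seq_start();
	payload = 42; /* update-side work, ordered before the release store */
	seq_end();
	return NULL;
}

/* Runs one start/end cycle and returns the payload the waiter observed. */
static unsigned long run_demo(void)
{
	pthread_t t;
	unsigned long s, seen;

	payload = 0;
	/* Target: current (even) counter plus one full start/end cycle. */
	s = atomic_load_explicit(&seq, memory_order_relaxed) + 2;
	if (pthread_create(&t, NULL, updater, NULL) != 0)
		return 0;
	while (!seq_done(s))
		sched_yield(); /* poll the condition; no wakeup is awaited */
	seen = payload; /* ordered after the acquire load in seq_done() */
	pthread_join(t, NULL);
	return seen;
}
```

Calling run_demo() from a small main() (built with -pthread) returns the
payload seen by the waiter. Because the waiter may proceed the instant the
release store becomes visible, anything the updater does after that store,
including a sanity check on the counter, can race with the next cycle, which
is why the check sits before the store in this sketch.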