linux-kernel - Re: rcu: WARNING in rcu_seq

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CACT4Y+b05W9KUcPuxYbA_7BdkbTao_ydG5Pnk-Yf=ZVuTBSvBQ@mail.gmail.com>
Date:   Sun, 5 Mar 2017 11:50:39 +0100
From:   Dmitry Vyukov <dvyukov@...gle.com>
To:     Paul McKenney <paulmck@...ux.vnet.ibm.com>
Cc:     josh@...htriplett.org, Steven Rostedt <rostedt@...dmis.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        jiangshanlai@...il.com, LKML <linux-kernel@...r.kernel.org>,
        syzkaller <syzkaller@...glegroups.com>
Subject: Re: rcu: WARNING in rcu_seq_end

On Sat, Mar 4, 2017 at 9:40 PM, Paul E. McKenney
<paulmck@...ux.vnet.ibm.com> wrote:
> On Sat, Mar 04, 2017 at 05:01:19PM +0100, Dmitry Vyukov wrote:
>> Hello,
>>
>> Paul, you wanted bugs in rcu.
>
> Well, whether I want them or not, I must deal with them.  ;-)
>
>> I've got this WARNING while running syzkaller fuzzer on
>> 86292b33d4b79ee03e2f43ea0381ef85f077c760:
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 4832 at kernel/rcu/tree.c:3533
>> rcu_seq_end+0x110/0x140 kernel/rcu/tree.c:3533
>> Kernel panic - not syncing: panic_on_warn set ...
>> CPU: 0 PID: 4832 Comm: kworker/0:3 Not tainted 4.10.0+ #276
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Workqueue: events wait_rcu_exp_gp
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:15 [inline]
>>  dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
>>  panic+0x1fb/0x412 kernel/panic.c:179
>>  __warn+0x1c4/0x1e0 kernel/panic.c:540
>>  warn_slowpath_null+0x2c/0x40 kernel/panic.c:583
>>  rcu_seq_end+0x110/0x140 kernel/rcu/tree.c:3533
>>  rcu_exp_gp_seq_end kernel/rcu/tree_exp.h:36 [inline]
>>  rcu_exp_wait_wake+0x8a9/0x1330 kernel/rcu/tree_exp.h:517
>>  rcu_exp_sel_wait_wake kernel/rcu/tree_exp.h:559 [inline]
>>  wait_rcu_exp_gp+0x83/0xc0 kernel/rcu/tree_exp.h:570
>>  process_one_work+0xc06/0x1c20 kernel/workqueue.c:2096
>>  worker_thread+0x223/0x19c0 kernel/workqueue.c:2230
>>  kthread+0x326/0x3f0 kernel/kthread.c:227
>>  ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Kernel Offset: disabled
>> Rebooting in 86400 seconds..
>>
>>
>> Not reproducible. But looking at the code, shouldn't it be:
>>
>>  static void rcu_seq_end(unsigned long *sp)
>>  {
>>         smp_mb(); /* Ensure update-side operation before counter increment. */
>> +       WARN_ON_ONCE(!(*sp & 0x1));
>>         WRITE_ONCE(*sp, *sp + 1);
>> -       WARN_ON_ONCE(*sp & 0x1);
>>  }
>>
>> ?
>>
>> Otherwise wait_event in _synchronize_rcu_expedited can return as soon
>> as WRITE_ONCE(*sp, *sp + 1) finishes. As far as I understand this
>> consequently can allow start of next grace periods. Which in turn can
>> make the warning fire. Am I missing something?
>>
>> I don't see any other bad consequences of this. The rest of
>> rcu_exp_wait_wake can proceed when _synchronize_rcu_expedited has
>> returned and destroyed work on stack and next period has started and
>> ended, but it seems OK.
>
> I believe that this is a heygood change, but I don't see how it will
> help in this case.  BTW, may I have your Signed-off-by?
>
> The reason I don't believe that it will help is that the
> rcu_exp_gp_seq_end() function is called from a workqueue handler that
> is invoked holding ->exp_mutex, and this mutex is not released until
> after the handler invokes rcu_seq_end() and then wakes up the task that
> scheduled the workqueue handler.  So the ordering above should not matter
> (but I agree that your ordering is cleaner.
>
> That said, it looks like I am missing some memory barriers, please
> see the following patch.
>
> But what architecture did you see this on?


This is just x86.

You seem to assume that wait_event() waits for the wakeup. It does not
work this way. It can return as soon as the condition becomes true
without ever waiting:

305 #define wait_event(wq, condition)                                       \
306 do {                                                                    \
307         might_sleep();                                                  \
308         if (condition)                                                  \
309                 break;                                                  \
310         __wait_event(wq, condition);                                    \
311 } while (0)

Mailed a signed patch:
https://groups.google.com/d/msg/syzkaller/XzUXuAzKkCw/5054wU9MEAAJ