linux-kernel - Re: [PATCH tip/core/rcu 1/2] rcu: Parallelize and economize NOCB kthread wakeups

Message-ID: <CAJhHMCC6XUo0erp6mPyjiz5614DQq_+gfVaLR5s9YZkP=jPVCw@mail.gmail.com>
Date:	Sat, 23 Aug 2014 03:43:38 -0400
From:	Pranith Kumar <pranith@...ech.edu>
To:	Paul McKenney <paulmck@...ux.vnet.ibm.com>
Cc:	Amit Shah <amit.shah@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>,
	Rik van Riel <riel@...hat.com>,
	Ingo Molnar <mingo@...nel.org>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	Dipankar Sarma <dipankar@...ibm.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Josh Triplett <josh@...htriplett.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	David Howells <dhowells@...hat.com>,
	Eric Dumazet <edumazet@...gle.com>, dvhart@...ux.intel.com,
	Frédéric Weisbecker <fweisbec@...il.com>,
	Oleg Nesterov <oleg@...hat.com>,
	Silas Boyd-Wickizer <sbw@....edu>
Subject: Re: [PATCH tip/core/rcu 1/2] rcu: Parallelize and economize NOCB
 kthread wakeups

On Fri, Aug 22, 2014 at 5:53 PM, Paul E. McKenney
<paulmck@...ux.vnet.ibm.com> wrote:
>
> Hmmm...  Please try replacing the synchronize_rcu() in
> __sysrq_swap_key_ops() with (say) schedule_timeout_interruptible(HZ / 10).
> I bet that gets rid of the hang.  (And also introduces a low-probability
> bug, but should be OK for testing.)
>
> The other thing to try is to revert your patch that turned my event
> traces into printk()s, then put an ftrace_dump(DUMP_ALL); just after
> the synchronize_rcu() -- that might make it so that the ftrace data
> actually gets dumped out.
>

I was able to reproduce this error on my Ubuntu 14.04 machine. I think
I found the root cause of the problem after several kvm runs.

The problem is that earlier the nocb kthread waited on nocb_head, and
now it waits on nocb_leader_wake.

A lot of nocb callbacks are enqueued before the nocb kthread is
spawned. This leaves nocb_head non-NULL, so with the old wait condition
the nocb kthread woke up immediately after going to sleep.

Now that we have switched to nocb_leader_wake, the flag is not set
when there are pending callbacks, unless the callbacks overflow the
qhimark. There were around 7000 pending callbacks when the boot hung.

So setting the qhimark using the boot parameter rcutree.qhimark=5000
is one way to boot past the hang, because it forces a wakeup of the
nocb kthread. I am not sure this is fool-proof.
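
To make the difference between the two wait conditions concrete, here
is a minimal user-space sketch, not kernel code. Everything named
model_* (plus old_should_run and new_should_run) is hypothetical, the
enqueue path is reduced to the behaviour described above, and the
default qhimark of 10000 is an assumption that should be checked
against the tree being tested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_CBS 8192	/* hypothetical upper bound for this model */

struct model_cb {
	struct model_cb *next;
};

struct model_rdp {
	struct model_cb *nocb_head;	/* pending callbacks, NULL when empty */
	long pending;			/* number of pending callbacks */
	bool nocb_leader_wake;		/* explicit wakeup flag */
};

/* Old wait condition: any queued callback lets the kthread proceed. */
static bool old_should_run(const struct model_rdp *rdp)
{
	return rdp->nocb_head != NULL;
}

/* New wait condition: only the flag counts, not the queue contents. */
static bool new_should_run(const struct model_rdp *rdp)
{
	return rdp->nocb_leader_wake;
}

/*
 * Enqueue before the kthread exists. In this simplified model the flag
 * is set only when the queue grows past qhimark.
 */
static void model_enqueue_early(struct model_rdp *rdp, struct model_cb *cb,
				long qhimark)
{
	cb->next = rdp->nocb_head;
	rdp->nocb_head = cb;
	rdp->pending++;
	if (rdp->pending > qhimark)
		rdp->nocb_leader_wake = true;
}

/*
 * Queue n early callbacks, then report whether the kthread would run
 * under the new condition (return value) and the old one (*old_runs).
 */
static bool model_boot(long n, long qhimark, bool *old_runs)
{
	static struct model_cb cbs[MODEL_MAX_CBS];
	struct model_rdp rdp = { NULL, 0, false };
	long i;

	for (i = 0; i < n && i < MODEL_MAX_CBS; i++)
		model_enqueue_early(&rdp, &cbs[i], qhimark);
	*old_runs = old_should_run(&rdp);
	return new_should_run(&rdp);
}

static void model_demo(void)
{
	bool old_runs;

	/* 7000 pending, assumed default qhimark: new condition never fires. */
	assert(!model_boot(7000, 10000, &old_runs) && old_runs);
	/* rcutree.qhimark=5000: the overflow sets the flag. */
	assert(model_boot(7000, 5000, &old_runs) && old_runs);
}
```

In this model the old condition is satisfied in both cases, while the
new one is satisfied only once the queue has overflowed the lowered
qhimark, which matches what I saw in the kvm runs.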

Another option is to start the nocb kthreads with nocb_leader_wake set,
so that they can handle any pending callbacks. The following patch also
allows us to boot properly.

Phew! Let me know if this makes any sense :)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 00dc411..4c397aa 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2386,6 +2386,9 @@ static int rcu_nocb_kthread(void *arg)
        struct rcu_head **tail;
        struct rcu_data *rdp = arg;

+       if (rdp->nocb_leader == rdp)
+               rdp->nocb_leader_wake = true;
+
        /* Each pass through this loop invokes one batch of callbacks */
        for (;;) {
                /* Wait for callbacks. */

-- 
Pranith