Message-Id: <20180225174927.GC2855@linux.vnet.ibm.com>
Date:   Sun, 25 Feb 2018 09:49:27 -0800
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     linux-kernel@...r.kernel.org, mingo@...nel.org,
        jiangshanlai@...il.com, dipankar@...ibm.com,
        akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
        josh@...htriplett.org, tglx@...utronix.de, peterz@...radead.org,
        dhowells@...hat.com, edumazet@...gle.com, fweisbec@...il.com,
        oleg@...hat.com, Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH tip/core/rcu 06/10] trace: Eliminate
 cond_resched_rcu_qs() in favor of cond_resched()

On Sat, Feb 24, 2018 at 03:12:40PM -0500, Steven Rostedt wrote:
> On Fri,  1 Dec 2017 11:21:40 -0800
> "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> wrote:
> 
> > Now that cond_resched() also provides RCU quiescent states when
> > needed, it can be used in place of cond_resched_rcu_qs().  This
> > commit therefore makes this change.
> 
> Are you sure this is true?

Up to a point.  If a given CPU has been blocking an RCU grace period for
long enough, that CPU's rcu_dynticks.rcu_need_heavy_qs will be set, and
then the next cond_resched() will be treated as a cond_resched_rcu_qs().

However, to your point, if there is no grace period in progress or if 
the current grace period is not waiting on the CPU in question or if
the grace-period kthread is starved of CPU, then cond_resched() has no
effect on RCU.  Unless of course it results in a context switch.
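
The behavior described above can be sketched as a small single-threaded
userspace model. This is only an illustration under stated assumptions:
struct model_cpu, model_request_heavy_qs() and model_cond_resched() are
hypothetical names, the two flags merely echo the rcu_dynticks fields
mentioned here, and none of this is the kernel's actual code.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state; NOT the kernel's struct rcu_dynticks. */
struct model_cpu {
	bool rcu_urgent_qs;      /* RCU would like a quiescent state soon */
	bool rcu_need_heavy_qs;  /* RCU has waited long enough to insist */
	unsigned long qs_count;  /* quiescent states reported so far */
};

/* Grace-period side: this CPU has blocked the grace period for too long. */
static void model_request_heavy_qs(struct model_cpu *c)
{
	c->rcu_need_heavy_qs = true;
	c->rcu_urgent_qs = true;
}

/*
 * Model of cond_resched().  Returns true if the call counted as an RCU
 * quiescent state, false if it had no effect on RCU.
 */
static bool model_cond_resched(struct model_cpu *c, bool context_switch)
{
	/* A real context switch always counts, flags or no flags. */
	bool report = context_switch ||
		      (c->rcu_urgent_qs && c->rcu_need_heavy_qs);

	if (!report)
		return false;	/* no grace period is waiting on this CPU */

	c->rcu_urgent_qs = false;
	c->rcu_need_heavy_qs = false;
	c->qs_count++;
	return true;
}
```

In this model a call with neither flag set and no context switch returns
false, which is the "no effect on RCU" case; once
model_request_heavy_qs() has run, the next call reports a quiescent
state and clears both flags.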

> I just bisected a lock up on my machine down to this commit.
> 
> With CONFIG_TRACEPOINT_BENCHMARK=y
> 
> # cd linux.git/tools/testing/selftests/ftrace/
> # ./ftracetest test.d/ftrace/func_traceonoff_triggers.tc
> 
> Locks up with a backtrace of:
> 
> [  614.186509] INFO: rcu_tasks detected stalls on tasks:

Ah, but this is RCU-tasks!  Which never sets rcu_dynticks.rcu_need_heavy_qs,
thus needing a real context switch.
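
To make the difference concrete, here is a rough userspace sketch of why
an RCU-tasks grace period keeps waiting on a task that only calls a
non-switching cond_resched(). It assumes, as a simplification, that a
task stops being a holdout only once its voluntary-context-switch count
has moved since the grace period started. struct model_task and the
model_*() helpers are hypothetical names, not kernel code.

```c
#include <stdbool.h>

/* Hypothetical, simplified task state; NOT the kernel's task_struct. */
struct model_task {
	unsigned long nvcsw;       /* voluntary context switches so far */
	unsigned long snap_nvcsw;  /* nvcsw sampled at grace-period start */
	bool holdout;              /* still blocking the grace period? */
};

/* Grace-period start: snapshot the counter and mark the task a holdout. */
static void model_gp_start(struct model_task *t)
{
	t->snap_nvcsw = t->nvcsw;
	t->holdout = true;
}

/* cond_resched() that finds nothing to switch to: counter unchanged. */
static void model_cond_resched_noswitch(struct model_task *t)
{
	(void)t;
}

/* A real voluntary context switch bumps the counter. */
static void model_voluntary_switch(struct model_task *t)
{
	t->nvcsw++;
}

/* Periodic scan; returns true while the task still blocks the GP. */
static bool model_check_holdout(struct model_task *t)
{
	if (t->holdout && t->nvcsw != t->snap_nvcsw)
		t->holdout = false;
	return t->holdout;
}
```

However many times model_cond_resched_noswitch() is called,
model_check_holdout() keeps returning true; only
model_voluntary_switch() lets the modeled grace period make progress.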

Hey, when you said that synchronize_rcu_tasks() could take a very long
time, I took you at your word!  ;-)

Does the following (untested, probably does not even build) patch make
cond_resched() take a more peremptory approach to RCU-tasks?

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0c337f5ba3c4..5155fe5e7702 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1088,12 +1088,16 @@ EXPORT_SYMBOL_GPL(rcu_is_watching);
 void rcu_request_urgent_qs_task(struct task_struct *t)
 {
 	int cpu;
+	struct rcu_dynticks *rdtp;
 
 	barrier();
 	cpu = task_cpu(t);
 	if (!task_curr(t))
 		return; /* This task is not running on that CPU. */
-	smp_store_release(per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, cpu), true);
+	rdtp = per_cpu_ptr(&rcu_dynticks, cpu);
+	WRITE_ONCE(rdtp->rcu_need_heavy_qs, true);
+	/* Store rcu_need_heavy_qs before rcu_urgent_qs. */
+	smp_store_release(&rdtp->rcu_urgent_qs, true);
 }
 
 #if defined(CONFIG_PROVE_RCU) && defined(CONFIG_HOTPLUG_CPU)
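
The ordering that the new comment in the patch asks for ("Store
rcu_need_heavy_qs before rcu_urgent_qs") can be illustrated in userspace
with C11 atomics. This is a sketch, not the kernel implementation: a
relaxed store stands in for WRITE_ONCE(), a release store for
smp_store_release(), and the reader is assumed to use a matching acquire
load of the urgent flag. struct model_flags and the model_*() functions
are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two per-CPU flags touched by the patch. */
struct model_flags {
	atomic_bool need_heavy_qs;
	atomic_bool urgent_qs;
};

static void model_init(struct model_flags *f)
{
	atomic_init(&f->need_heavy_qs, false);
	atomic_init(&f->urgent_qs, false);
}

/* Writer: publish need_heavy_qs first, then urgent_qs with release. */
static void model_request(struct model_flags *f)
{
	atomic_store_explicit(&f->need_heavy_qs, true, memory_order_relaxed);
	/* Release: the store above is visible to anyone who sees this one. */
	atomic_store_explicit(&f->urgent_qs, true, memory_order_release);
}

/*
 * Reader: returns 0 if nothing was requested, 1 if only urgent_qs is
 * set, and 2 if need_heavy_qs is set as well.
 */
static int model_poll(struct model_flags *f)
{
	/* Acquire pairs with the release store in model_request(). */
	if (!atomic_load_explicit(&f->urgent_qs, memory_order_acquire))
		return 0;
	if (atomic_load_explicit(&f->need_heavy_qs, memory_order_relaxed))
		return 2;
	return 1;
}
```

With this pairing, a reader that observes urgent_qs set after
model_request() is guaranteed to observe need_heavy_qs set too, so it
cannot see the urgent flag and then miss the heavy request.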