lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Wed, 12 Apr 2017 10:39:59 -0700 From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> To: linux-kernel@...r.kernel.org Cc: mingo@...nel.org, jiangshanlai@...il.com, dipankar@...ibm.com, akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com, josh@...htriplett.org, tglx@...utronix.de, peterz@...radead.org, rostedt@...dmis.org, dhowells@...hat.com, edumazet@...gle.com, fweisbec@...il.com, oleg@...hat.com, bobby.prani@...il.com, "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com> Subject: [PATCH tip/core/rcu 14/40] rcu: Place guard on rcu_all_qs() and rcu_note_context_switch() actions The rcu_all_qs() and rcu_note_context_switch() do a series of checks, taking various actions to supply RCU with quiescent states, depending on the outcomes of the various checks. This is a bit much for scheduling fastpaths, so this commit creates a separate ->rcu_urgent_qs field in the rcu_dynticks structure that acts as a global guard for these checks. Thus, in the common case, rcu_all_qs() and rcu_note_context_switch() check the ->rcu_urgent_qs field, find it false, and simply return. Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@...radead.org> --- .../Design/Data-Structures/Data-Structures.html | 11 +++++- kernel/rcu/tree.c | 43 +++++++++++++--------- kernel/rcu/tree.h | 3 +- kernel/rcu/tree_exp.h | 2 + kernel/rcu/tree_plugin.h | 8 +++- 5 files changed, 45 insertions(+), 22 deletions(-) diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.html b/Documentation/RCU/Design/Data-Structures/Data-Structures.html index e4bf20a68fa3..4dec89097559 100644 --- a/Documentation/RCU/Design/Data-Structures/Data-Structures.html +++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.html @@ -1106,6 +1106,7 @@ Its fields are as follows: 3 atomic_t dynticks; 4 bool rcu_need_heavy_qs; 5 unsigned long rcu_qs_ctr; + 6 bool rcu_urgent_qs; </pre> <p>The <tt>->dynticks_nesting</tt> field counts the @@ -1131,12 +1132,20 @@ it is willing to call for heavy-weight dyntick-counter operations. This flag is checked by RCU's context-switch and <tt>cond_resched()</tt> code, which provide a momentary idle sojourn in response. -</p><p>Finally the <tt>->rcu_qs_ctr</tt> field is used to record +</p><p>The <tt>->rcu_qs_ctr</tt> field is used to record quiescent states from <tt>cond_resched()</tt>. Because <tt>cond_resched()</tt> can execute quite frequently, this must be quite lightweight, as in a non-atomic increment of this per-CPU field. +</p><p>Finally, the <tt>->rcu_urgent_qs</tt> field is used to record +the fact that the RCU core code would really like to see a quiescent +state from the corresponding CPU, with the various other fields indicating +just how badly RCU wants this quiescent state. +This flag is checked by RCU's context-switch and <tt>cond_resched()</tt> +code, which, if nothing else, non-atomically increment <tt>->rcu_qs_ctr</tt> +in response. + <table> <tr><th> </th></tr> <tr><th align="left">Quick Quiz:</th></tr> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c index e0c75488f30d..ef4b8f63c06d 100644 --- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -466,10 +466,16 @@ void rcu_note_context_switch(void) trace_rcu_utilization(TPS("Start context switch")); rcu_sched_qs(); rcu_preempt_note_context_switch(); + /* Load rcu_urgent_qs before other flags. */ + if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) + goto out; + this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); if (unlikely(raw_cpu_read(rcu_dynticks.rcu_need_heavy_qs))) rcu_momentary_dyntick_idle(); for_each_rcu_flavor(rsp) do_nocb_deferred_wakeup(this_cpu_ptr(rsp->rda)); + this_cpu_inc(rcu_dynticks.rcu_qs_ctr); +out: trace_rcu_utilization(TPS("End context switch")); barrier(); /* Avoid RCU read-side critical sections leaking up. */ } @@ -493,34 +499,28 @@ void rcu_all_qs(void) unsigned long flags; struct rcu_state *rsp; + if (!raw_cpu_read(rcu_dynticks.rcu_urgent_qs)) + return; + preempt_disable(); + /* Load rcu_urgent_qs before other flags. */ + if (!smp_load_acquire(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs))) { + preempt_enable(); + return; + } + this_cpu_write(rcu_dynticks.rcu_urgent_qs, false); barrier(); /* Avoid RCU read-side critical sections leaking down. */ if (unlikely(raw_cpu_read(rcu_dynticks.rcu_need_heavy_qs))) { local_irq_save(flags); rcu_momentary_dyntick_idle(); local_irq_restore(flags); } - if (unlikely(raw_cpu_read(rcu_sched_data.cpu_no_qs.b.exp))) { - /* - * Yes, we just checked a per-CPU variable with preemption - * enabled, so we might be migrated to some other CPU at - * this point. That is OK because in that case, the - * migration will supply the needed quiescent state. - * We might end up needlessly disabling preemption and - * invoking rcu_sched_qs() on the destination CPU, but - * the probability and cost are both quite low, so this - * should not be a problem in practice. - */ - preempt_disable(); + if (unlikely(raw_cpu_read(rcu_sched_data.cpu_no_qs.b.exp))) rcu_sched_qs(); - preempt_enable(); - } - for_each_rcu_flavor(rsp) { - preempt_disable(); + for_each_rcu_flavor(rsp) do_nocb_deferred_wakeup(this_cpu_ptr(rsp->rda)); - preempt_enable(); - } this_cpu_inc(rcu_dynticks.rcu_qs_ctr); barrier(); /* Avoid RCU read-side critical sections leaking up. */ + preempt_enable(); } EXPORT_SYMBOL_GPL(rcu_all_qs); @@ -1256,6 +1256,7 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp, { unsigned long jtsq; bool *rnhqp; + bool *ruqp; unsigned long rjtsc; struct rcu_node *rnp; @@ -1291,11 +1292,15 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp, * might not be the case for nohz_full CPUs looping in the kernel. */ rnp = rdp->mynode; + ruqp = per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, rdp->cpu); if (time_after(jiffies, rdp->rsp->gp_start + jtsq) && READ_ONCE(rdp->rcu_qs_ctr_snap) != per_cpu(rcu_dynticks.rcu_qs_ctr, rdp->cpu) && READ_ONCE(rdp->gpnum) == rnp->gpnum && !rdp->gpwrap) { trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, TPS("rqc")); return 1; + } else { + /* Load rcu_qs_ctr before store to rcu_urgent_qs. */ + smp_store_release(ruqp, true); } /* Check for the CPU being offline. */ @@ -1331,6 +1336,8 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp, (time_after(jiffies, rdp->rsp->gp_start + jtsq) || time_after(jiffies, rdp->rsp->jiffies_resched))) { WRITE_ONCE(*rnhqp, true); + /* Store rcu_need_heavy_qs before rcu_urgent_qs. */ + smp_store_release(ruqp, true); rdp->rsp->jiffies_resched += 5; /* Re-enable beating. */ } diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h index b212cd0f22c7..d2f276fc2edc 100644 --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -113,8 +113,9 @@ struct rcu_dynticks { /* Process level is worth LLONG_MAX/2. */ int dynticks_nmi_nesting; /* Track NMI nesting level. */ atomic_t dynticks; /* Even value for idle, else odd. */ - bool rcu_need_heavy_qs; /* GP old, need heavy quiescent state. */ + bool rcu_need_heavy_qs; /* GP old, need heavy quiescent state. */ unsigned long rcu_qs_ctr; /* Light universal quiescent state ctr. */ + bool rcu_urgent_qs; /* GP old need light quiescent state. */ #ifdef CONFIG_NO_HZ_FULL_SYSIDLE long long dynticks_idle_nesting; /* irq/process nesting level from idle. */ diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h index a7b639ccd46e..a1f52bbe9db6 100644 --- a/kernel/rcu/tree_exp.h +++ b/kernel/rcu/tree_exp.h @@ -331,6 +331,8 @@ static void sync_sched_exp_handler(void *data) return; } __this_cpu_write(rcu_sched_data.cpu_no_qs.b.exp, true); + /* Store .exp before .rcu_urgent_qs. */ + smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true); resched_cpu(smp_processor_id()); } diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h index 0e2114e8076b..4a9df7ea89a4 100644 --- a/kernel/rcu/tree_plugin.h +++ b/kernel/rcu/tree_plugin.h @@ -1863,7 +1863,9 @@ static void __call_rcu_nocb_enqueue(struct rcu_data *rdp, trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WakeEmpty")); } else { - rdp->nocb_defer_wakeup = RCU_NOGP_WAKE; + WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOGP_WAKE); + /* Store ->nocb_defer_wakeup before ->rcu_urgent_qs. */ + smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true); trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WakeEmptyIsDeferred")); } @@ -1875,7 +1877,9 @@ static void __call_rcu_nocb_enqueue(struct rcu_data *rdp, trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WakeOvf")); } else { - rdp->nocb_defer_wakeup = RCU_NOGP_WAKE_FORCE; + WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOGP_WAKE_FORCE); + /* Store ->nocb_defer_wakeup before ->rcu_urgent_qs. */ + smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true); trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WakeOvfIsDeferred")); } -- 2.5.2
Powered by blists - more mailing lists