Message-ID: <CALCETrWjMrdJPHSjA64rJpVCaEtZCr-ZeE08sKP1-rbvw-4+tQ@mail.gmail.com>
Date: Wed, 9 Nov 2016 03:14:35 -0800
From: Andy Lutomirski <luto@...capital.net>
To: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc: Chris Metcalf <cmetcalf@...lanox.com>,
Gilad Ben Yossef <giladb@...lanox.com>,
Steven Rostedt <rostedt@...dmis.org>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Rik van Riel <riel@...hat.com>, Tejun Heo <tj@...nel.org>,
Frederic Weisbecker <fweisbec@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Christoph Lameter <cl@...ux.com>,
Viresh Kumar <viresh.kumar@...aro.org>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>,
Daniel Lezcano <daniel.lezcano@...aro.org>,
Francis Giraldeau <francis.giraldeau@...il.com>,
Andi Kleen <andi@...stfloor.org>,
Arnd Bergmann <arnd@...db.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: task isolation discussion at Linux Plumbers
On Tue, Nov 8, 2016 at 5:40 PM, Paul E. McKenney
<paulmck@...ux.vnet.ibm.com> wrote:
> commit 49961e272333ac720ac4ccbaba45521bfea259ae
> Author: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> Date: Tue Nov 8 14:25:21 2016 -0800
>
> rcu: Maintain special bits at bottom of ->dynticks counter
>
> Currently, IPIs are used to force other CPUs to invalidate their TLBs
> in response to a kernel virtual-memory mapping change. This works, but
> degrades both battery lifetime (for idle CPUs) and real-time response
> (for nohz_full CPUs), and in addition results in unnecessary IPIs due to
> the fact that CPUs executing in usermode are unaffected by stale kernel
> mappings. It would be better to cause a CPU executing in usermode to
> wait until it is entering kernel mode to
Missing words here? The sentence cuts off.
>
> This commit therefore reserves a bit at the bottom of the ->dynticks
> counter, which is checked upon exit from extended quiescent states. If it
> is set, it is cleared and then a new rcu_dynticks_special_exit() macro
> is invoked, which, if not supplied, is an empty single-pass do-while loop.
> If this bottom bit is set on -entry- to an extended quiescent state,
> then a WARN_ON_ONCE() triggers.
>
> This bottom bit may be set using a new rcu_dynticks_special_set()
> function, which returns true if the bit was set, or false if the CPU
> turned out to not be in an extended quiescent state. Please note that
> this function refuses to set the bit for a non-nohz_full CPU when that
> CPU is executing in usermode because usermode execution is tracked by
> RCU as a dyntick-idle extended quiescent state only for nohz_full CPUs.
I'm inclined to suggest s/dynticks/eqs/ in the public API. To me,
"dynticks" is a feature, whereas "eqs" means "extended quiescent
state" and means something concrete about the CPU state.
>
> Reported-by: Andy Lutomirski <luto@...capital.net>
> Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
>
> diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
> index 4f9b2fa2173d..130d911e4ba1 100644
> --- a/include/linux/rcutiny.h
> +++ b/include/linux/rcutiny.h
> @@ -33,6 +33,11 @@ static inline int rcu_dynticks_snap(struct rcu_dynticks *rdtp)
> return 0;
> }
>
> +static inline bool rcu_dynticks_special_set(int cpu)
> +{
> + return false; /* Never flag non-existent other CPUs! */
> +}
> +
> static inline unsigned long get_state_synchronize_rcu(void)
> {
> return 0;
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index dbf20b058f48..8de83830e86b 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -279,23 +279,36 @@ static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
> };
>
> /*
> + * Steal a bit from the bottom of ->dynticks for idle entry/exit
> + * control. Initially this is for TLB flushing.
> + */
> +#define RCU_DYNTICK_CTRL_MASK 0x1
> +#define RCU_DYNTICK_CTRL_CTR (RCU_DYNTICK_CTRL_MASK + 1)
> +#ifndef rcu_dynticks_special_exit
> +#define rcu_dynticks_special_exit() do { } while (0)
> +#endif
> +
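(For my own understanding: RCU_DYNTICK_CTRL_CTR = MASK + 1 implies the
counter is meant to advance in units of 2, leaving bit 0 free for the
flag. A quick userspace C11-atomics model of that layout, not the
kernel code itself:)

```c
#include <stdatomic.h>

#define RCU_DYNTICK_CTRL_MASK 0x1  /* low bit: special-action flag */
#define RCU_DYNTICK_CTRL_CTR  (RCU_DYNTICK_CTRL_MASK + 1)  /* counter LSB */

/* Model of ->dynticks: EQS transitions advance the counter in steps
 * of RCU_DYNTICK_CTRL_CTR (2), so ordinary entry/exit never touches
 * bit 0 and the flag survives across transitions until acted upon. */
static atomic_int model_dynticks;

static int model_eqs_transition(void)
{
	return atomic_fetch_add(&model_dynticks, RCU_DYNTICK_CTRL_CTR)
	       + RCU_DYNTICK_CTRL_CTR;
}
```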
> /*
> @@ -305,17 +318,21 @@ static void rcu_dynticks_eqs_enter(void)
> static void rcu_dynticks_eqs_exit(void)
> {
> struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> + int seq;
>
> /*
> - * CPUs seeing atomic_inc() must see prior idle sojourns,
> + * CPUs seeing atomic_inc_return() must see prior idle sojourns,
> * and we also must force ordering with the next RCU read-side
> * critical section.
> */
> - smp_mb__before_atomic(); /* See above. */
> - atomic_inc(&rdtp->dynticks);
> - smp_mb__after_atomic(); /* See above. */
> + seq = atomic_inc_return(&rdtp->dynticks);
> WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> - !(atomic_read(&rdtp->dynticks) & 0x1));
> + !(seq & RCU_DYNTICK_CTRL_CTR));
> + if (seq & RCU_DYNTICK_CTRL_MASK) {
> + atomic_and(~RCU_DYNTICK_CTRL_MASK, &rdtp->dynticks);
> + smp_mb__after_atomic(); /* Clear bits before acting on them */
> + rcu_dynticks_special_exit();
I think this needs to be reversed for NMI safety: do the callback and
then clear the bits.
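For concreteness, here is roughly the ordering I mean, as a userspace
C11-atomics sketch (model_special_exit() and the other names are mine,
not the kernel's):

```c
#include <stdatomic.h>

#define RCU_DYNTICK_CTRL_MASK 0x1
#define RCU_DYNTICK_CTRL_CTR  (RCU_DYNTICK_CTRL_MASK + 1)

static atomic_int model_dynticks;
static int special_exits;  /* counts hook invocations */

static void model_special_exit(void)
{
	special_exits++;
}

/* EQS exit with the two steps reversed: run the special-action hook
 * while the flag is still set, then clear the flag.  An NMI arriving
 * between the steps still sees the bit set, rather than finding it
 * cleared before the requested action has been performed. */
static int model_eqs_exit(void)
{
	int seq = atomic_fetch_add(&model_dynticks, RCU_DYNTICK_CTRL_CTR)
		  + RCU_DYNTICK_CTRL_CTR;

	if (seq & RCU_DYNTICK_CTRL_MASK) {
		model_special_exit();              /* act first ... */
		atomic_fetch_and(&model_dynticks,  /* ... then clear */
				 ~RCU_DYNTICK_CTRL_MASK);
	}
	return seq;
}
```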
> +/*
> + * Set the special (bottom) bit of the specified CPU so that it
> + * will take special action (such as flushing its TLB) on the
> + * next exit from an extended quiescent state. Returns true if
> + * the bit was successfully set, or false if the CPU was not in
> + * an extended quiescent state.
> + */
> +bool rcu_dynticks_special_set(int cpu)
> +{
> + int old;
> + int new;
> + struct rcu_dynticks *rdtp = &per_cpu(rcu_dynticks, cpu);
> +
> + do {
> + old = atomic_read(&rdtp->dynticks);
> + if (old & RCU_DYNTICK_CTRL_CTR)
> + return false;
> + new = old | ~RCU_DYNTICK_CTRL_MASK;
Shouldn't this be old | RCU_DYNTICK_CTRL_MASK?
> + } while (atomic_cmpxchg(&rdtp->dynticks, old, new) != old);
> + return true;
> }
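As written, old | ~RCU_DYNTICK_CTRL_MASK would try to set every bit
except the flag, so the cmpxchg could never succeed against a sane
counter value. What I'd expect, in a userspace model (names mine):

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RCU_DYNTICK_CTRL_MASK 0x1
#define RCU_DYNTICK_CTRL_CTR  (RCU_DYNTICK_CTRL_MASK + 1)

static atomic_int model_dynticks;

/* Set the special (bottom) bit, but only if the CPU is in an EQS,
 * i.e. the counter's low counting bit is clear.  Corrected to OR in
 * the flag bit itself rather than its complement. */
static bool model_special_set(void)
{
	int old, new;

	do {
		old = atomic_load(&model_dynticks);
		if (old & RCU_DYNTICK_CTRL_CTR)
			return false;  /* not in an EQS */
		new = old | RCU_DYNTICK_CTRL_MASK;
	} while (!atomic_compare_exchange_strong(&model_dynticks,
						 &old, new));
	return true;
}
```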
--Andy