Message-ID: <20110927160100.GC2335@linux.vnet.ibm.com>
Date: Tue, 27 Sep 2011 09:01:00 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Pavel Ivanov <paivanof@...il.com>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Thomas Gleixner <tglx@...utronix.de>,
Lai Jiangshan <laijs@...fujitsu.com>,
Ingo Molnar <mingo@...hat.com>
Subject: Re: [PATCH 1/7] rcu: Fix preempt-unsafe debug check of rcu extended quiescent state
On Tue, Sep 27, 2011 at 11:16:02AM -0400, Pavel Ivanov wrote:
> On Tue, Sep 27, 2011 at 7:50 AM, Frederic Weisbecker <fweisbec@...il.com> wrote:
> > On Mon, Sep 26, 2011 at 06:04:06PM -0400, Pavel Ivanov wrote:
> >> On Mon, Sep 26, 2011 at 6:19 AM, Frederic Weisbecker <fweisbec@...il.com> wrote:
> >> > In the rcu_check_extended_qs() function that is used to check
> >> > for illegal uses of RCU under extended quiescent states, we look
> >> > at the local value of dynticks, which is even if the CPU is in
> >> > an extended quiescent state and odd otherwise.
> >> >
> >> > However, we look at it without disabling preemption, and this
> >> > opens a race window where we may read the state of a remote
> >> > CPU instead, as in the following scenario:
> >> >
> >> >          CPU 1                                 CPU 2
> >> >
> >> > bool rcu_check_extended_qs(void)
> >> > {
> >> >     struct rcu_dynticks *rdtp;
> >> >
> >> >     rdtp = &per_cpu(rcu_dynticks,
> >> >                     raw_smp_processor_id());
> >> >
> >> >     < ---- Task is migrated here ---- >
> >> >
> >> >                                      // CPU 1 goes idle and
> >> >                                      // increases rdtp->dynticks
> >> >
> >> >     /* Here we are reading the value for
> >> >        the remote CPU 1 instead of the local CPU */
> >> >     if (atomic_read(&rdtp->dynticks) & 0x1)
> >> >         return false;
> >> >
> >> >     return true;
> >> > }
> >> >
> >> > The possible result of this is a false-positive return value from
> >> > that function, suggesting we are in an extended quiescent state in
> >> > random places.
> >>
> >> How is this different from what your patch allows?
> >>
> >>          CPU 1                                 CPU 2
> >>
> >> bool rcu_check_extended_qs(void)
> >> {
> >>     struct rcu_dynticks *rdtp =
> >>         &get_cpu_var(rcu_dynticks);
> >>     bool ext_qs = true;
> >>
> >>     if (atomic_read(&rdtp->dynticks) & 0x1)
> >>         ext_qs = false;
> >>
> >>     put_cpu_var(rcu_dynticks);
> >>
> >>     < ---- Task is migrated here ---- >
> >>
> >>                                      /* Here we return true/false
> >>                                         based on the value for the
> >>                                         remote CPU 1 instead of the
> >>                                         local CPU */
> >>
> >>     return ext_qs;
> >> }
> >>
> >>
> >> Looks like it can produce the same false positive.
> >
> > Why?
> >
> > While calling rcu_read_lock(), the task can still be migrated,
> > for example from CPU 0 to CPU 1, until we do our check
> > in rcu_check_extended_qs() with preemption disabled. But that's
> > not a problem: we are going to do our check on either CPU 0 or
> > CPU 1, and that doesn't matter much.
>
> I don't have the sources at hand right now to check all call sites of
> rcu_check_extended_qs(). But are you saying that
> rcu_check_extended_qs() is always called after rcu_read_lock()? If
> that's the case, then preemption is already disabled, right? Otherwise
> I don't understand your reasoning here.
For kernels built with CONFIG_TREE_PREEMPT_RCU or CONFIG_TINY_PREEMPT_RCU,
rcu_read_lock() does not imply preempt_disable(). So you really can be
in an RCU read-side critical section without having preemption disabled.
> > What matters is that we do that check while ensuring we are really
> > checking the value of the per-cpu variable on the CPU we are
> > currently running on, and not some other random one that can change
> > its dynticks value at the same time.
>
> Define the "CPU we are currently running on" in this context. Is it
> the CPU executing the call to rcu_check_extended_qs(), or the CPU
> executing the return from rcu_check_extended_qs()? These CPUs can be
> different both before your patch and after it. And the function can
> return the extended-qs state from either of these CPUs, again both
> before and after the patch. If the calling code wants these CPUs to
> be the same, it has to disable preemption before making the call. And
> if it does so, then additional preemption disabling inside the
> function is pointless.
>
> All your patch does is change the possible scenarios of preemption...
> Wait a minute... Is that the whole point of the patch? If the CPU
> where rcu_check_extended_qs() was called is in an extended qs, then
> the function cannot be preempted and so it will correctly return true
> from the current CPU. If the CPU where rcu_check_extended_qs() was
> called is not in an extended qs, then the function can be preempted
> and migrated. But the CPU it migrates to won't be in an extended qs
> either, and thus the function will correctly return false no matter
> which per_cpu variable is read. Is my understanding correct now? If
> so, it would be better to explain this in the commit log.
>
> BTW, isn't it possible for the CPU to wake up from an extended qs due
> to an interrupt arriving in the middle of the function?
Indeed this can happen.  The calls to rcu_irq_enter() and
rcu_irq_exit() handle this transition from the dyntick-idle extended
quiescent state to RCU being usable in the interrupt handler.
Thanx, Paul