linux-kernel - Re: [PATCH 11/16] rcu: Check for spurious wakeup using return value

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20140724181223.GO11241@linux.vnet.ibm.com>
Date:	Thu, 24 Jul 2014 11:12:23 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Pranith Kumar <bobby.prani@...il.com>
Cc:	Josh Triplett <josh@...htriplett.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Lai Jiangshan <laijs@...fujitsu.com>,
	"open list:READ-COPY UPDATE..." <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 11/16] rcu: Check for spurious wakeup using return value

On Thu, Jul 24, 2014 at 12:03:34AM -0400, Pranith Kumar wrote:
> On Wed, Jul 23, 2014 at 11:43 PM, Paul E. McKenney
> <paulmck@...ux.vnet.ibm.com> wrote:
> > On Wed, Jul 23, 2014 at 10:36:19PM -0400, Pranith Kumar wrote:
> >> On Wed, Jul 23, 2014 at 8:26 AM, Paul E. McKenney
> >> <paulmck@...ux.vnet.ibm.com> wrote:
> >> > On Wed, Jul 23, 2014 at 01:09:48AM -0400, Pranith Kumar wrote:
> >> >> When the gp_kthread wakes up from the wait event, it returns 0 if the wake up is
> >> >> due to the condition having been met. This commit checks this return value
> >> >> for a spurious wake up before calling rcu_gp_init().
> >> >>
> >> >> Signed-off-by: Pranith Kumar <bobby.prani@...il.com>
> >> >
> >> > How does this added check help?  I don't see that it does.  If the flag
> >> > is set, we want to wake up.  If we get a spurious wakeup, but then the
> >> > flag gets set before we actually wake up, we still want to wake up.
> >>
> >> So I took a look at the docs again, and using the return value is the
> >> recommended way to check for spurious wakeups.
> >>
> >> The condition in wait_event_interruptible() is checked when the task
> >> is woken up (either due to stray signals or explicitly) and it returns
> >> true if condition evaluates to true.
> 
> this should be returns '0' if the condition evaluates to true.

Ah, but if the condition changes while wait_event_interruptible() is in
the process of returning, it is quite possible that the answer will be
different afterwards.  For example, consider this scenario:

o	wait_event_interruptible() decides to return spuriously for
	whatever reason, and thus returns a non-zero value.

o	An invocation of (say) rcu_start_gp_advanced() sets ->gp_flags
	to RCU_GP_FLAG_INIT, thus requesting that a new grace period
	start.

o	At this point, the return value says that we should not start
	a new grace period, but the ->gp_flags value says that we
	should.

Because it is the ->gp_flags value that really knows, the current code
ignores wait_event_interruptible()'s return value and just checks
the ->gp_flags value.

> >> In the current scenario, if we get a spurious wakeup, we take the
> >> costly path of checking this condition again (with a barrier and lock)
> >> before going back to wait.
> >>
> >> The scenario of getting an actual wakeup after getting a spurious
> >> wakeup exists even today, this is the window after detecting a
> >> spurious wakeup and before going back to wait. I am not sure if using
> >> the return value enlarges that window as we are going back to sleep
> >> immediately.
> >>
> >> Thoughts?
> >
> > If the flag is set, why should we care whether or not the wakeup was
> > spurious?  If the flag is not set, why should we care whether or not
> > wait_event_interruptible() thought that the wakeup was not spurious?
> 
> A correction about the return value above: return will be 0 if the
> condition is true, in this case if the flag is set.
> 
> If the flag is set, ret will be 0 and we will go ahead with
> rcu_gp_init(). (no change wrt current behavior)

Sorry, this is not always correct.  RCU is highly concurrent, so you do
need to start thinking in terms of concurrency.  Your reasoning above
is a symptom of single-threaded thinking.  Please see my scenario above
showing how the return can be non-zero even though ->gp_flags is set.

> If the flag is not set, currently we go ahead and call rcu_gp_init()
> from where we check if the flag is set (after a lock+barrier) and
> return.

True enough.  Is that really a problem?  If so, exactly why is it a
problem?

> If we care about what wait_event_interruptible() returns, we can go
> back and wait for an actual wakeup much earlier without the additional
> overhead of calling rcu_gp_init().

The key phrase here is "If we care".  Should we care?  If so, why?

I suggest running some random benchmark and counting how many times
rcu_gp_init() is called and how many times rcu_gp_init() returns
because ->gp_flags is not set.  If rcu_gp_init() returns because
->gp_flags is not set a significant fraction of the time, then this
-might- be worth worrying about.  (Extra credit: Under what conditions
would it be worth worrying about, and how would you go about checking
to see whether those conditions hold?)

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/