lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20140429153752.GE8754@linux.vnet.ibm.com>
Date:	Tue, 29 Apr 2014 08:37:52 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	Pranith Kumar <bobby.prani@...il.com>
Cc:	LKML <linux-kernel@...r.kernel.org>, mingo@...nel.org,
	laijs@...fujitsu.com, dipankar@...ibm.com,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
	peterz@...radead.org, rostedt@...dmis.org, dhowells@...hat.com,
	edumazet@...gle.com, darren@...art.com, fweisbec@...il.com,
	Oleg Nesterov <oleg@...hat.com>, sbw@....edu
Subject: Re: [PATCH tip/core/rcu 2/3] documentation: Record rcu_dereference()
 value mishandling

On Tue, Apr 29, 2014 at 01:42:13AM -0400, Pranith Kumar wrote:
> Minor nits below:
> 
> Other than that Acked-by: Pranith Kumar <bobby.prani@...il.com>
> 
> On Tue, Apr 29, 2014 at 1:04 AM, Andev <debiandev@...il.com> wrote:
> > From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
> >
> > Recent LKML discussings (see http://lwn.net/Articles/586838/ and
> > http://lwn.net/Articles/588300/ for the LWN writeups) brought out
> > some ways of misusing the return value from rcu_dereference() that
> > are not necessarily completely intuitive.  This commit therefore
> > documents what can and cannot safely be done with these values.
> >
> > Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> snip
> > +
> > +       o       The pointer is never dereferenced after being compared.
> > +               Since there are no subsequent dereferences, the compiler
> > +               cannot use anything it learned from the comparison
> > +               to reorder the non-existent subsequent dereferences.
> > +               This sort of comparison occurs frequently when scanning
> > +               RCU-protected circular linked lists.
> > +
> > +       o       The comparison is against a pointer pointer that
> 
> duplicate pointer, remove one

Good catch, fixed!

> > +               references memory that was initialized "a long time ago."
> > +               The reason this is safe is that even if misordering
> > +               occurs, the misordering will not affect the accesses
> > +               that follow the comparison.  So exactly how long ago is
> > +               "a long time ago"?  Here are some possibilities:
> snip
> > +       o       All of the accesses following the comparison are stores,
> > +               so that a control dependency preserves the needed ordering.
> > +               That said, it is easy to get control dependencies wrong.
> > +               Please see the "CONTROL DEPENDENCIES" section of
> > +               Documentation/memory-barriers.txt for more details.
> > +
> > +       o       The pointers compared not-equal -and- the compiler does
> 
> add in "are " - The pointers compared are not-equal...

Actually, "compared" is a verb here.  But that use is a bit obscure, so
taking your suggestion as a bug report.  I changed it to read:

	The pointers are not equal -and- the compiler does not have
	enough information to deduce the value of the pointer.

Fair enough?

> > +               not have enough information to deduce the value of the
> > +               pointer.  Note that the volatile cast in rcu_dereference()
> > +               will normally prevent the compiler from knowing too much.
> > +
> > +o      Disable any value-speculation optimizations that your compiler
> > +       might provide, especially if you are making use of feedback-based
> > +       optimizations that take data collected from prior runs.  Such
> > +       value-speculation optimizations reorder operations by design.
> > +
> > +       There is one exception to this rule:  Value-speculation
> > +       optimizations that leverage the branch-prediction hardware are
> > +       safe on strongly ordered systems (such as x86), but not on weakly
> > +       ordered systems (such as ARM or Power).  Choose your compiler
> > +       command-line options wisely!
> > +
> > +
> > +EXAMPLE OF AMPLIFIED RCU-USAGE BUG
> > +
> > +Because updaters can run concurrently with RCU readers, RCU readers can
> > +see stale and/or inconsistent values.  If RCU readers need fresh or
> > +consistent values, which they sometimes do, they need to take proper
> > +precautions.  To see this, consider the following code fragment:
> > +
> > +       struct foo {
> > +               int a;
> > +               int b;
> > +               int c;
> > +       };
> > +       struct foo *gp1;
> > +       struct foo *gp2;
> > +
> > +       void updater(void)
> > +       {
> > +               struct foo *p;
> > +
> > +               p = kmalloc(...);
> > +               if (p == NULL)
> > +                       deal_with_it();
> > +               p->a = 42;  /* Each field in its own cache line. */
> > +               p->b = 43;
> > +               p->c = 44;
> > +               rcu_assign_pointer(gp1, p);
> > +               p->b = 143;
> > +               p->c = 144;
> > +               rcu_assign_pointer(gp2, p);
> > +       }
> > +
> > +       void reader(void)
> > +       {
> > +               struct foo *p;
> > +               struct foo *q;
> > +               int r1, r2;
> > +
> > +               p = rcu_dereference(gp2);
> > +               r1 = p->b;  /* Guaranteed to get 143. */
> > +               q = rcu_dereference(gp1);
> > +               if (p == q) {
> > +                       /* The compiler decides that q->c is same as p->c. */
> > +                       r2 = p->c; /* Could get 44 on weakly order system. */
> > +               }
> > +       }
> > +
> > +You might be surprised that the outcome (r1 == 143 && r2 == 44) is possible,
> > +but you should not be.  After all, the updater might have been invoked
> > +a second time between the time reader() loaded into "r1" and the time
> > +that it loaded into "r2".  The fact that this same result can occur due
> > +to some reordering from the compiler and CPUs is beside the point.
> > +
> > +But suppose that the reader needs a consistent view?
> > +
> > +Then one approach is to use locking, for example, as follows:
> > +
> > +       struct foo {
> > +               int a;
> > +               int b;
> > +               int c;
> > +               spinlock_t lock;
> > +       };
> > +       struct foo *gp1;
> > +       struct foo *gp2;
> > +
> > +       void updater(void)
> > +       {
> > +               struct foo *p;
> > +
> > +               p = kmalloc(...);
> > +               if (p == NULL)
> > +                       deal_with_it();
> > +               spin_lock(&p->lock);
> > +               p->a = 42;  /* Each field in its own cache line. */
> > +               p->b = 43;
> > +               p->c = 44;
> > +               spin_unlock(&p->lock);
> > +               rcu_assign_pointer(gp1, p);
> > +               spin_lock(&p->lock);
> > +               p->b = 143;
> > +               p->c = 144;
> > +               spin_unlock(&p->lock);
> > +               rcu_assign_pointer(gp2, p);
> > +       }
> > +
> > +       void reader(void)
> > +       {
> > +               struct foo *p;
> > +               struct foo *q;
> > +               int r1, r2;
> > +
> > +               p = rcu_dereference(gp2);
> > +               spin_lock(&p->lock);
> > +               r1 = p->b;  /* Guaranteed to get 143. */
> > +               q = rcu_dereference(gp1);
> > +               if (p == q) {
> > +                       /* The compiler decides that q->c is same as p->c. */
> > +                       r2 = p->c; /* Could get 44 on weakly order system. */
> > +               }
> > +               spin_unlock(&p->lock);
> > +       }
> 
> shouldn't the comment here reflect that r2 can never get 44 and only
> can get 144 once you use a lock?

Indeed it should, good catch, fixed!

I also need to check the load from gp2 for NULL, fixed that too.  (If
gp2 is non-NULL, the later load from gp1 is guaranteed to be non-NULL.)

							Thanx, Paul

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ