Date:	Mon, 8 Nov 2010 03:11:36 +0100
From:	"Udo A. Steinberg" <udo@...ervisor.org>
To:	paulmck@...ux.vnet.ibm.com
Cc:	Joe Korty <joe.korty@...r.com>, fweisbec@...il.com,
	mathieu.desnoyers@...icios.com, dhowells@...hat.com,
	loic.minier@...aro.org, dhaval.giani@...il.com, tglx@...utronix.de,
	peterz@...radead.org, linux-kernel@...r.kernel.org,
	josh@...htriplett.org
Subject: Re: [PATCH] a local-timer-free version of RCU

On Sat, 6 Nov 2010 12:28:12 -0700 Paul E. McKenney (PEM) wrote:

PEM> > + * rcu_quiescent() is called from rcu_read_unlock() when a
PEM> > + * RCU batch was started while the rcu_read_lock/rcu_read_unlock
PEM> > + * critical section was executing.
PEM> > + */
PEM> > +
PEM> > +void rcu_quiescent(int cpu)
PEM> > +{
PEM> 
PEM> What prevents two different CPUs from calling this concurrently?
PEM> Ah, apparently nothing -- the idea being that
PEM> rcu_grace_period_complete() sorts it out.  Though if the second CPU was
PEM> delayed, it seems like it might incorrectly end a subsequent grace
PEM> period as follows:
PEM> 
PEM> o	CPU 0 clears the second-to-last bit.
PEM> 
PEM> o	CPU 1 clears the last bit.
PEM> 
PEM> o	CPU 1 sees that the mask is empty, so invokes
PEM> 	rcu_grace_period_complete(), but is delayed in the function
PEM> 	preamble.
PEM> 
PEM> o	CPU 0 sees that the mask is empty, so invokes
PEM> 	rcu_grace_period_complete(), ending the grace period.
PEM> 	Because the RCU_NEXT_PENDING is set, it also starts
PEM> 	a new grace period.
PEM> 
PEM> o	CPU 1 continues in rcu_grace_period_complete(), incorrectly
PEM> 	ending the new grace period.
PEM> 
PEM> Or am I missing something here?

The scenario you describe seems possible. However, it should be easy to fix
by passing the perceived batch number as another parameter to rcu_set_state()
and making it part of the cmpxchg. Then, if the caller tries to set state bits
on a stale batch number (i.e., batch != rcu_batch), the attempt can be
detected and rejected.
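
A minimal user-space sketch of that idea, using C11 atomics rather than the
kernel's cmpxchg(). The word layout (batch number above two state bits) and
the names rcu_word, current_batch(), set_state_for_batch() and
start_next_batch() are assumptions for illustration, not taken from the patch:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: batch number in the upper bits, state flags in the low two. */
#define RCU_COMPLETE      ((uint64_t)1 << 0)
#define RCU_NEXT_PENDING  ((uint64_t)1 << 1)
#define RCU_STATE_MASK    (RCU_COMPLETE | RCU_NEXT_PENDING)
#define RCU_BATCH_SHIFT   2

/* Hypothetical combined word: (batch << RCU_BATCH_SHIFT) | state bits. */
_Atomic uint64_t rcu_word;

/* Batch number a CPU would sample before deciding to change the state. */
uint64_t current_batch(void)
{
    return atomic_load(&rcu_word) >> RCU_BATCH_SHIFT;
}

/*
 * Set state bits on behalf of the batch the caller believes is current.
 * Returns false, leaving the word untouched, if that batch is stale.
 */
bool set_state_for_batch(uint64_t batch, uint64_t bits)
{
    assert((bits & ~RCU_STATE_MASK) == 0);

    uint64_t old = atomic_load(&rcu_word);
    do {
        if ((old >> RCU_BATCH_SHIFT) != batch)
            return false;               /* batch moved on: reject */
        /* On failure the CAS reloads 'old', so the batch is re-checked. */
    } while (!atomic_compare_exchange_weak(&rcu_word, &old, old | bits));

    return true;
}

/* Advance from 'batch' to 'batch + 1' with clear state bits, if still current. */
bool start_next_batch(uint64_t batch)
{
    uint64_t old = atomic_load(&rcu_word);
    do {
        if ((old >> RCU_BATCH_SHIFT) != batch)
            return false;
    } while (!atomic_compare_exchange_weak(&rcu_word, &old,
                                           (batch + 1) << RCU_BATCH_SHIFT));
    return true;
}
```

Replaying the interleaving above: both CPUs sample the same batch number, CPU 0
completes that batch and starts the next one, and the delayed CPU 1 then calls
set_state_for_batch() with the old number. Because the batch number is part of
the compared word, that call returns false instead of ending the new grace
period.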

There is a similar, although harmless, issue in call_rcu(): two CPUs can
concurrently add callbacks to their respective nxt lists and compute the same
value for nxtbatch. One CPU succeeds in setting the PENDING bit while
observing COMPLETE to be clear, so it starts a new batch. Afterwards, the
other CPU also sets the PENDING bit, but this time for the next batch. So
it ends up requesting nxtbatch+1, although there is no need for it. This
would also be fixed by making the batch number part of the cmpxchg.

Cheers,

	- Udo
