Date:	Wed, 12 Aug 2009 09:06:45 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	"Michael S. Tsirkin" <mst@...hat.com>
Cc:	Gregory Haskins <gregory.haskins@...il.com>,
	netdev@...r.kernel.org, virtualization@...ts.linux-foundation.org,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Ingo Molnar <mingo@...e.hu>, linux-mm@...ck.org,
	Andrew Morton <akpm@...ux-foundation.org>, hpa@...or.com
Subject: Re: [PATCHv2 2/2] vhost_net: a kernel-level virtio server

On Wed, Aug 12, 2009 at 06:51:54PM +0300, Michael S. Tsirkin wrote:
> On Wed, Aug 12, 2009 at 08:26:39AM -0700, Paul E. McKenney wrote:
> > On Wed, Aug 12, 2009 at 05:15:59PM +0300, Michael S. Tsirkin wrote:
> > > On Wed, Aug 12, 2009 at 07:11:07AM -0700, Paul E. McKenney wrote:
> > > > On Wed, Aug 12, 2009 at 04:25:40PM +0300, Michael S. Tsirkin wrote:
> > > > > On Wed, Aug 12, 2009 at 09:01:35AM -0400, Gregory Haskins wrote:
> > > > > > I think I understand what your comment above meant:  You don't need to
> > > > > > do synchronize_rcu() because you can flush the workqueue instead to
> > > > > > ensure that all readers have completed.
> > > > > 
> > > > > Yes.
> > > > > 
> > > > > > >  But if that's true, to me, the
> > > > > > rcu_dereference itself is gratuitous,
> > > > > 
> > > > > Here's a thesis on what rcu_dereference does (besides documentation):
> > > > > 
> > > > > reader does this
> > > > > 
> > > > > 	A: sock = n->sock
> > > > > 	B: use *sock
> > > > > 
> > > > > Say writer does this:
> > > > > 
> > > > > 	C: newsock = allocate socket
> > > > > 	D: initialize(newsock)
> > > > > 	E: n->sock = newsock
> > > > > 	F: flush
> > > > > 
> > > > > 
> > > > > On Alpha, reads could be reordered.  So, on smp, command A could get
> > > > > data from point F, and command B - from point D (uninitialized, from
> > > > > cache).  IOW, you get fresh pointer but stale data.
> > > > > So we need to stick a barrier in there.
> > > > > 
> > > > > > and that pointer is *not* actually
> > > > > > RCU protected (nor does it need to be).
> > > > > 
> > > > > Heh, if readers are lockless and writer does init/update/sync,
> > > > > this to me spells rcu.
> > > > 
> > > > If you are using call_rcu(), synchronize_rcu(), or one of the
> > > > similar primitives, then you absolutely need rcu_read_lock() and
> > > > rcu_read_unlock(), or one of the similar pairs of primitives.
> > > 
> > > Right. I don't use any of these though.
> > > 
> > > > If you -don't- use rcu_read_lock(), then you are pretty much restricted
> > > > to adding data, but never removing it.
> > > > 
> > > > Make sense?  ;-)
> > > 
> > > Since I only access data from a workqueue, I replaced synchronize_rcu
> > > with workqueue flush. That's why I don't need rcu_read_lock.
> > 
> > Well, you -do- need -something- that takes on the role of rcu_read_lock(),
> > and in your case you in fact actually do.  Your equivalent of
> > rcu_read_lock() is the beginning of execution of a workqueue item, and
> > the equivalent of rcu_read_unlock() is the end of execution of that same
> > workqueue item.  Implicit, but no less real.
> 
> Well put. I'll add this to comments in my code.

Very good, thank you!!!

> > If a couple more uses like this show up, I might need to add this to
> > Documentation/RCU.  ;-)

And I idly wonder if this approach could replace SRCU.  Probably not
for protecting the CPU-hotplug notifier chains, but worth some thought.

							Thanx, Paul
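
The pattern discussed above (lockless readers that run only inside
workqueue items, and a writer that does init/update/flush) can be
sketched in userspace C. This is only an illustration under stated
assumptions, not the vhost_net code: struct sock_stub, struct net_stub,
reader_use(), work_item(), writer_swap() and demo() are hypothetical
names, a pthread stands in for a workqueue item, pthread_join() stands
in for the workqueue flush, and a C11 acquire load stands in for the
ordering that rcu_dereference() provides in the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins for n->sock in the thread; not vhost code. */
struct sock_stub { int fd; };
struct net_stub { _Atomic(struct sock_stub *) sock; };
struct work { struct net_stub *n; int seen_fd; };

/* Reader, steps A and B: load the pointer, then use *sock. */
static int reader_use(struct net_stub *n)
{
    /* The acquire load keeps the read of s->fd from being satisfied with
     * data older than the pointer (fresh pointer, stale data). */
    struct sock_stub *s =
        atomic_load_explicit(&n->sock, memory_order_acquire);
    return s ? s->fd : -1;
}

/* One "work item": its start and end bound the implicit read-side
 * critical section, taking the role of rcu_read_lock()/rcu_read_unlock(). */
static void *work_item(void *arg)
{
    struct work *w = arg;
    w->seen_fd = reader_use(w->n);
    return NULL;
}

/* Writer, steps C to F. Returns the old socket once no reader can hold it. */
static struct sock_stub *writer_swap(struct net_stub *n, int fd,
                                     pthread_t *pending, size_t npending)
{
    struct sock_stub *newsock = malloc(sizeof *newsock);      /* C */
    if (!newsock)
        abort();
    newsock->fd = fd;                                         /* D */
    struct sock_stub *old = atomic_exchange_explicit(
        &n->sock, newsock, memory_order_acq_rel);             /* E */
    for (size_t i = 0; i < npending; i++)                     /* F: "flush" */
        pthread_join(pending[i], NULL);
    return old;
}

/* Runs one reader concurrently with one update; returns the fd seen
 * by a read performed after the update has completed. */
int demo(void)
{
    struct net_stub n;
    struct sock_stub *first = malloc(sizeof *first);
    if (!first)
        abort();
    first->fd = 3;
    atomic_init(&n.sock, first);

    struct work w = { .n = &n, .seen_fd = -1 };
    pthread_t t;
    if (pthread_create(&t, NULL, work_item, &w) != 0)
        abort();

    struct sock_stub *old = writer_swap(&n, 7, &t, 1);
    /* The in-flight work item saw either the old or the new socket. */
    assert(w.seen_fd == 3 || w.seen_fd == 7);
    free(old);              /* safe: the flush waited for every reader */

    int now = reader_use(&n);
    free(atomic_load(&n.sock));
    return now;
}
```

Freeing the old socket is only safe because every reader runs inside a
work item that the flush waits for. A reader outside that context would
need a real rcu_read_lock()/synchronize_rcu() pairing instead.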