[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150228233053.GK15405@linux.vnet.ibm.com>
Date: Sat, 28 Feb 2015 15:30:53 -0800
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Andi Kleen <ak@...ux.intel.com>,
Andi Kleen <andi@...stfloor.org>, x86@...nel.org,
linux-kernel@...r.kernel.org, oleg@...hat.com,
rusty@...tcorp.com.au, mingo@...nel.org
Subject: Re: [RFC][PATCH] module: Optimize __module_address() using a latched
RB-tree
On Fri, Feb 27, 2015 at 11:01:14AM +0100, Peter Zijlstra wrote:
> On Thu, Feb 26, 2015 at 11:41:44AM -0800, Paul E. McKenney wrote:
> > > As per the above argument; without doing the whole READ/WRITE_ONCE for
> > > the rb tree primitives, I think we're fine. We don't actually need the
> > > read_barrier_depends() I think.
> > >
> > > I think this because raw_read_seqcount() will issue an rmb, which should
> > > be enough for the 'stable' tree. For the unstable one we don't care as
> > > long as we don't see split loads/stores.
> >
> > I agree that raw_read_seqcount() will issue an rmb(), but that won't help.
> > For Alpha, you need the smp_rmb() to be between the load of any given
> > pointer and the first dereference of that same pointer.
>
> OK, it seems I'm confused on Alpha, perhaps you can clarify.
>
> My confusion stems from the fact that when things are properly
> serialized with locks we do not need these additional barriers.
The readers hold locks? Keep in mind that raw_read_seqcount() isn't
really a lock -- updates can run concurrently with a read. A doomed
read, to be sure, but nevertheless a read that is traversing pointers
that are changing out from under it.
> Then why would we need them to correctly iterate the stable copy of the
> tree?
>
> I appreciate we would need them to correctly iterate an true lockless
> data structure, but our in-flight copy of the tree cannot be correctly
> iterated anyway, all we want is for that iteration to:
>
> 1) only observe valid nodes -- ie. no pointers out into the wild due
> to split load/stores.
>
> 2) terminate -- ie. not get stuck in loops.
True, but don't new nodes get inserted into a tree that might possibly
be concurrently read?
> And I don't see that read_barrier_depends() helping with either for the
> unstable tree; 1) is guaranteed by my patch making the modifiers user
> WRITE_ONCE(), and 2) we agreed is guaranteed by the modifiers not having
> loops in program order, any cache effects will disappear over time.
Well, if the readers and writer are really truly excluding each other,
then no problem. But if the readers really can run concurrently with
writers, even if the readers will subsequently retry, you do need those
barriers on Alpha.
The canonical URL on this topic:
http://www.openvms.compaq.com/wizard/wiz_2637.html
> > > No, I think someone is 'hoping' sync_rcu() implies sync_sched(). But
> > > yes, I should look harder at this. I was assuming the existing code was
> > > OK here.
> >
> > I am not blaming you for the code itself, but rather just blaming you
> > for causing me to notice that the code was broken. ;-)
> >
> > How about if I make something that allows overlapping grace periods,
> > so that we could be safe without latency penalty? One approach would
> > be a synchronize_rcu_mult() with a bitmask indicating which to wait for.
> > Another would be to make variants of get_state_synchronize_rcu() and
> > cond_synchronize_rcu() for RCU-sched as well as for RCU, but also to
> > make get_state_synchronize_rcu() force a grace period to start if
> > one was not already in progress.
>
> This is module unload, not a fast path by anyones measure I think.
> However if you do go about making such a primitive I know of at least
> one other place this could be used; we have the following in
> _cpu_down():
>
> #ifdef CONFIG_PREEMPT
> synchronize_sched();
> #endif
> synchronize_rcu();
OK, definitely seems worth thinking about, then.
Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists