[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160301093242.GA29874@fixme-laptop.cn.ibm.com>
Date: Tue, 1 Mar 2016 17:32:42 +0800
From: Boqun Feng <boqun.feng@...il.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...nel.org>,
Josh Triplett <josh@...htriplett.org>,
Steven Rostedt <rostedt@...dmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Lai Jiangshan <jiangshanlai@...il.com>, sasha.levin@...cle.com
Subject: Re: [RFC v2 0/6] Track RCU dereferences in RCU read-side critical
sections
On Mon, Feb 29, 2016 at 01:43:18PM +0100, Peter Zijlstra wrote:
> On Mon, Feb 29, 2016 at 09:12:20AM +0800, Boqun Feng wrote:
>
> > The problem is:
> >
> > Currently there is no way to know which rcu_dereference() and its
> > friends a rcu_read_lock() or one of its friends is protecting. And
> > the lack of such information is a pain for kernel developers and
> > reviewers to understand the purpose of a certain rcu_read_lock().
>
> If I have to go run a kernel and look at /proc data in order to complete
> a review the patch will fail review, end of story.
>
OK, got it.
> > > Secondly, I'm not sure a random list of data is going to help anybody.
> > >
> >
> > One possible usage might be:
> >
> > If someone is going to refactor a piece of code which originally uses
> > RCU for synchronization, and now he wants to switch to other
> > synchronization mechanism for some data because of some reason(for
> > consistency or writer side latency guarantee, etc.). Before he makes the
> > change, he needs to know if this rcu_read_lock() also protects other
> > data, and this is not an easy job and I don't think we have anything
> > helpful for this. With RCU_LOCKED_ACCESS, he can build the kernel with
> > RCU_LOCKED_ACCESS enabled, run some workloads and get the information
> > from a /proc file. In this way, RCU_LOCKED_ACCESS is helpful to
> > developers who want to know the information of the associations between
> > rcu_read_lock() and rcu_dereference().
>
> So 'possible', but will anyone actually do this? `
>
Not sure, but one of the purposes of sending out this RFC is telling
people we have a way to do this and see if they are interested in ;-)
> > > The whole point of lockdep is that it tells you if you do broken. This
> > > doesn't, and cannot, because it lacks information to verify against.
> > >
> >
> > Yes.. RCU_LOCKED_ACCESS is not a validator, so it cannot be as useful as
> > lockdep for finding bug.
>
> I would try really _really_ hard to make it a validator, those are so
> much more useful.
>
Agreed. I would love to make it a validator too. But I think that
information collection is the first step and an essential part for
making a validator, right? And what RCU_LOCKED_ACCESS does are
collecting information and showing that via a proc file. I really think
it's better to know what information we can collect before we actually
try to make a validator, besides, the proc file output could serve as a
way to debug the validator if we really had a validator.
So maybe we can first discuss about and work on how to collect and show
the information, and then seek a way for making a validator?
> One could for example allow something like:
>
> rcu_read_lock();
> rcu_annotate(&var->field);
>
> foo();
>
> rcu_read_unlock();
>
> As an alternative to the syntax suggested by Ingo. This would allow
> keeping the existing rcu_read_lock() signature so you don't have to
> force update the entire kernel at once, while also (easily) allowing
> multiple variables. Like:
>
> rcu_read_lock();
> rcu_annotate(&var->field);
> rcu_annotate(&var2->field2);
>
> You can then have a special rule that if a particular RCU section has an
> annotation, any rcu_dereference() not matched will field a warning. If
> the annotation section is empty, nothing.
>
Good idea! but I don't think annotating a field in C language is easy,
I will try to see what we can get. Do you have something already in your
mind?
>
> > > So I'm still not sure this is useful. Also, I would argue your code has
> > > problems if you cannot even find your rcu_read_lock().
> > >
> >
> > I think what you mean here is, for example, the case where we use
> > preempt_disable() instead of rcu_read_lock_sched() to pair with
> > synchronize_sched(), right?
>
> No, I was more like:
>
> rcu_read_lock();
> foo()
> bar()
> var->func();
> obj->func();
> whatever();
>
> and you're looking at a change to whatever() and wonder where the heck
> the corresponding rcu_read_lock() lives and if we're having it held at
> all.
>
Confused.. RCU_LOCKED_ACCESS has such information, For example, in the
piece of /proc/locked_access/rcu I put in the cover letter, which I will
put in the commit logs for the next version of this series:
ACQCHAIN 0xfe042af3bbfb2605, 3 locks, irq_context 0:
LOCK at [<ffffffff81094b47>] SyS_kill+0x97/0x2a0
LOCK at [<ffffffff8109286f>] kill_pid_info+0x1f/0x140
LOCK at [<ffffffff81092605>] group_send_sig_info+0x5/0x130
ACCESS TYPE 1 at kernel/signal.c:695
, which actually reflects:
(in kernel/signal.c)
SyS_kill()
kill_something_info()
rcu_read_lock()
kill_pid_info()
rcu_read_lock()
group_send_sig_info()
rcu_read_lock()
check_kill_permission()
kill_ok_by_cred()
__task_cred()
and __task_cred() is an rcu_dereference(), so if anyone wants to figure
out what rcu_read_lock()s are hold when this deref happens, he can dump
the proc file, and search for the filename and linenumber of
__task_cred() and he will get the symbol + offset of all the
rcu_read_lock()s for this deref.
(Yes, I know there is an inconsistency on symbol + offset versus
filename + linenumber here, the reason of this inconsistency is that
the original purpose of this feature is to show some information to
developers to help understand code, therefore filename + linenumber is
more appropriate for this purpose. So I use filename + linenumber to
indicate the locations for rcu_dereference(), however, as what I can
only get from the task_struct::held_locks is acquire_ip, so I have to
use the symbol + offset for rcu_read_lock(), and there comes the
inconsistency. Maybe I should make them all symbol + offset, and use
tools like addr2line to convert to filename + linenumber.)
> > If so, sure, this series cannot do anything about this, however, this
>
> Why not? Can't you stick whatever you need into preempt_disable() ?
> kernel/sched/core.c:preempt_count_{add,sub}() are there for a (debug)
> reason.
>
Yeah, you are right. I can do something similar to preempt_disable() by
maintaining a stack of callsites of preempt_disable() for each thread,
which means a little much memory cost, but it's worthwhile if we make
it useful.
> > If you think this series deserves a reconsideration I will send out a
> > new version with more text about what's the problem and why this is
> > useful.
>
> So please make it a validator. That's so much more useful.
I cerntainly have done a bad job of the explanation of this series in
the cover letter and the commit logs, so I'm considering sending the
next version of this series with better text and examples, or you prefer
I should try to make it a validator first and send out as a whole for
reviewing?
Thank you for your thoughts and comments ;-)
Regards,
Boqun
Download attachment "signature.asc" of type "application/pgp-signature" (474 bytes)
Powered by blists - more mailing lists