Message-ID: <1507302609.2793.16.camel@redhat.com>
Date: Fri, 06 Oct 2017 17:10:09 +0200
From: Paolo Abeni <pabeni@...hat.com>
To: paulmck@...ux.vnet.ibm.com
Cc: linux-kernel@...r.kernel.org,
Josh Triplett <josh@...htriplett.org>,
Steven Rostedt <rostedt@...dmis.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Hannes Frederic Sowa <hannes@...essinduktion.org>,
netdev@...r.kernel.org
Subject: Re: [PATCH 0/4] RCU: introduce noref debug
Hi,
On Fri, 2017-10-06 at 06:34 -0700, Paul E. McKenney wrote:
> On Fri, Oct 06, 2017 at 02:57:45PM +0200, Paolo Abeni wrote:
> > The networking subsystem currently uses long-lived, RCU-protected
> > references to avoid the overhead of full book-keeping.
> >
> > Such references - skb_dst() noref - are stored inside the skbs and can be
> > moved across relevant slices of the network stack, with the users
> > being in charge of properly clearing the relevant skb - or properly
> > refcounting the related dst references - before the skb escapes the RCU
> > section.
> >
> > We currently don't have any deterministic debug infrastructure to check
> > the dst noref usages - and the introduction of other noref artifacts is
> > currently under discussion.
> >
> > This series tries to tackle the above by introducing, in patch one, an RCU
> > debug infrastructure aimed at spotting incorrect noref pointer usage. The
> > infrastructure is small and must be explicitly enabled via a newly introduced
> > build option.
> >
> > Patch two uses this infrastructure to track dst noref usage in the
> > networking stack.
> >
> > Patches 3 and 4 are bugfixes for small buglets found while running this
> > infrastructure on basic scenarios.
Thank you for the prompt reply!
>
> This patchset does not look like it handles rcu_read_lock() nesting.
> For example, given code like this:
>
> void foo(void)
> {
> 	rcu_read_lock();
> 	rcu_track_noref(&key2, &noref2, true);
> 	do_something();
> 	rcu_track_noref(&key2, &noref2, false);
> 	rcu_read_unlock();
> }
>
> void bar(void)
> {
> 	rcu_read_lock();
> 	rcu_track_noref(&key1, &noref1, true);
> 	do_something_more();
> 	foo();
> 	do_something_else();
> 	rcu_track_noref(&key1, &noref1, false);
> 	rcu_read_unlock();
> }
>
> void grill(void)
> {
> 	foo();
> }
>
> It looks like foo()'s rcu_read_unlock() will complain about key1.
> You could remove foo()'s rcu_read_lock() and rcu_read_unlock(), but
> that will break the call from grill().
Actually, the code should cope correctly with your example: when foo()'s
rcu_read_unlock() is called from bar(), 'cache' contains:
	{ { &key1, &noref1, 1}, // ...
and when the related __rcu_check_noref() is invoked, preempt_count() is
2, because the check is called before decreasing the preempt counter.
In the main loop inside __rcu_check_noref() we will always hit the
'continue' statement, because 'cache->store[i].nesting != nesting', so
no warning will be triggered.
> Or am I missing something subtle here? Given patch 3/4, I suspect not...
The problem with the code in patch 3/4 is different; currently
ip_route_input_noref() is basically doing:
	rcu_read_lock();
	rcu_track_noref(&key1, &noref1, true);
	rcu_read_unlock();
So the RCU lock there silences any RCU-based check inside
ip_route_input_noref(), but does not really protect the noref dst.
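To make both cases concrete, here is a minimal userspace sketch of the check as described in this mail. It is only a model, not the code from patch one: 'depth' stands in for preempt_count(), and the model_*() helpers, leaky(), the fixed-size 'store' and the 'warnings' counter are all made-up names for illustration. It assumes that a noref records the nesting level at which it was tracked, and that the check at unlock time runs before the counter is decreased and skips every entry recorded at a different level.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NOREF_MAX 8	/* arbitrary size, chosen for the sketch only */

struct noref_entry {
	const void *key;
	const void *noref;
	int nesting;	/* read-side depth at which the noref was tracked */
};

static struct noref_entry store[NOREF_MAX];
static int used;
static int depth;	/* stands in for preempt_count() */
static int warnings;	/* stands in for the debug warning */

/* Model of the check: only entries tracked at the current depth matter. */
static void model_check_noref(void)
{
	int nesting = depth;	/* sampled before the counter is decreased */

	for (int i = 0; i < used; i++) {
		if (store[i].nesting != nesting)
			continue;	/* belongs to an outer section */
		warnings++;		/* noref outlives its own section */
	}
}

static void model_rcu_read_lock(void)
{
	depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(depth > 0);
	model_check_noref();
	depth--;
}

static void model_track_noref(const void *key, const void *noref, bool track)
{
	if (track) {
		if (used < NOREF_MAX)
			store[used++] = (struct noref_entry){ key, noref, depth };
		return;
	}
	for (int i = 0; i < used; i++) {
		if (store[i].key == key) {
			store[i] = store[--used];	/* drop the entry */
			return;
		}
	}
}

static int key1, key2, noref1, noref2;

static void foo(void)
{
	model_rcu_read_lock();
	model_track_noref(&key2, &noref2, true);
	model_track_noref(&key2, &noref2, false);
	model_rcu_read_unlock();	/* depth is 2 when called from bar() */
}

static void bar(void)
{
	model_rcu_read_lock();
	model_track_noref(&key1, &noref1, true);	/* recorded at depth 1 */
	foo();
	model_track_noref(&key1, &noref1, false);
	model_rcu_read_unlock();
}

/* The pattern described for patch 3/4: tracked, never cleared. */
static void leaky(void)
{
	model_rcu_read_lock();
	model_track_noref(&key1, &noref1, true);
	model_rcu_read_unlock();	/* entry still present at depth 1 */
}
```

In this model, calling bar() and then foo() on its own (the grill() case) leaves 'warnings' at 0, while leaky() raises it to 1, because the noref is still tracked at the nesting level being left.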
Please let me know if the above clarifies the scenario.
Thanks,
Paolo