Message-ID: <20120327164740.GS2450@linux.vnet.ibm.com>
Date: Tue, 27 Mar 2012 09:47:40 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: Ben Greear <greearb@...delatech.com>
Cc: Eric Dumazet <eric.dumazet@...il.com>,
David Miller <davem@...emloft.net>, netdev@...r.kernel.org,
gregkh@...uxfoundation.org
Subject: Re: RCU lock bug in 3.0.21 (bisected to: 682cb56a, fix NULL
dereferences in check_peer_redir)
On Mon, Mar 26, 2012 at 10:30:52PM -0700, Ben Greear wrote:
> On 03/26/2012 10:11 PM, Paul E. McKenney wrote:
> >On Tue, Mar 27, 2012 at 02:07:14AM +0200, Eric Dumazet wrote:
> >>On Mon, 2012-03-26 at 16:46 -0700, Ben Greear wrote:
> >>
> >>>The 3.0.21 kernel doesn't appear to have a rcu_read_lock_return(),
> >>>so I can't use your patch below.
> >>
> >>This patch was only to show the point (I also CCed Paul, he might have
> >>some time to think about it, after he clears the inline stuff with
> >>Linus)
> >
> >There is an rcu_preempt_depth() that returns rcu_read_lock() nesting
> >level for CONFIG_PREEMPT_RCU=y on the one hand and returns zero
> >for CONFIG_PREEMPT_RCU=n on the other. So if you can reproduce
> >with CONFIG_PREEMPT_RCU=y, you can substitute rcu_preempt_depth()
> >for rcu_read_lock_return() in Eric's earlier patch.
>
> I'll try looking at that tomorrow. I tried adding some code to check for
> recursive calls to the fib-dump, and didn't see it ever hit, though
> the bug continued to happen readily.
>
> I just #if 0'd out the part between rcu_read_lock() and
> rcu_read_unlock(), and the problem went away... but of course you
> can't dump IPv6 routes then.
>
> The actual logic to dump the fib is quite complex, full of
> opaque types and other stuff ripe for bugs. But, I don't see
> how it could cause the rcu splats in such a repeatable manner.
>
> The bug is always reported as being in the same place, so if
> there is any other debugging code you can think of to help
> shed light on this, I'll be happy to add it and give it a try.
> For instance, is there a way to dump (print) all current holders of
> the rcu_read_lock? I could call that before/during/after in that
> method and maybe get a clue.
I would guess that CONFIG_PROVE_RCU's use of lockdep would permit
listing all tasks holding rcu_read_lock(), as lockdep does maintain
that state in that case.
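
For the rcu_preempt_depth() check mentioned earlier, something along
these lines might help narrow things down. This is only a rough sketch
that builds in userspace: the three rcu_* functions are mock stand-ins
for the real kernel primitives, and rcu_depth_changed(), run_checked(),
balanced_body() and unbalanced_body() are hypothetical names made up
for illustration, not anything from the fib code. The idea is to save
the nesting depth right after rcu_read_lock(), run the suspect code,
and complain if the depth is different afterwards.

```c
/*
 * Userspace model only. The rcu_* functions below are mock stand-ins so
 * that the sketch compiles outside the kernel. In a CONFIG_PREEMPT_RCU=y
 * kernel the real rcu_read_lock(), rcu_read_unlock() and
 * rcu_preempt_depth() would be used, with printk() in place of fprintf().
 */
#include <assert.h>
#include <stdio.h>

static int mock_nesting;	/* stands in for the per-task nesting count */

static void rcu_read_lock(void)     { mock_nesting++; }
static void rcu_read_unlock(void)   { mock_nesting--; }
static int  rcu_preempt_depth(void) { return mock_nesting; }

/* Hypothetical helper: complain if the depth differs from the saved one. */
static int rcu_depth_changed(const char *where, int saved)
{
	int now = rcu_preempt_depth();

	if (now != saved)
		fprintf(stderr, "%s: RCU depth %d, expected %d\n",
			where, now, saved);
	return now != saved;
}

/* Hypothetical bodies standing in for the code between lock and unlock. */
static void balanced_body(void)   { rcu_read_lock(); rcu_read_unlock(); }
static void unbalanced_body(void) { rcu_read_unlock(); } /* stray unlock */

/* Returns nonzero if body() left the nesting depth changed. */
static int run_checked(const char *name, void (*body)(void))
{
	int saved, bad;

	rcu_read_lock();
	saved = rcu_preempt_depth();
	assert(saved >= 1);	/* we hold at least our own read lock */

	body();
	bad = rcu_depth_changed(name, saved);

	mock_nesting = saved;	/* model only: undo any imbalance */
	rcu_read_unlock();
	return bad;
}
```

Calling run_checked("fib dump", unbalanced_body) in this model prints a
complaint and returns nonzero, while the balanced variant stays quiet.
Keep in mind that the real rcu_preempt_depth() always returns zero for
CONFIG_PREEMPT_RCU=n, so the check only means something on a preemptible
RCU build. For listing holders, I believe debug_show_all_locks() (also
reachable via SysRq-d) is the lockdep entry point to try, but I have not
checked that against 3.0.21.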
Thanx, Paul