Message-ID: <20110207102950.GA17044@brick.ozlabs.ibm.com>
Date: Mon, 7 Feb 2011 21:29:50 +1100
From: Paul Mackerras <paulus@...ba.org>
To: David Miller <davem@...emloft.net>
Cc: Knut_Petersen@...nline.de, linux-kernel@...r.kernel.org,
mostrows@...thlink.net, linux-ppp@...r.kernel.org
Subject: Re: [BUG] 2.6.38-rc2: Circular Locking Dependency
On Sun, Feb 06, 2011 at 11:28:56PM -0800, David Miller wrote:
> From: Knut Petersen <Knut_Petersen@...nline.de>
> Date: Mon, 24 Jan 2011 10:25:55 +0100
>
> > As I was hunting something different I found the following (potential)
> > problem on an openSuSE 11.3 system with kernel 2.6.38-rc2.
> > The message is triggered by smpppd starting a dsl connection.
> >
> > Knut
> >
> >
> > NET: Registered protocol family 24
> >
> > =======================================================
> > [ INFO: possible circular locking dependency detected ]
> > 2.6.38-rc2-kape #7
> > -------------------------------------------------------
> > pppd/2529 is trying to acquire lock:
> > (&(&pch->downl)->rlock){+.....}, at: [<f814a634>] ppp_push+0x59/0x4a8
> > [ppp_generic]
> >
> > but task is already holding lock:
> > (&(&ppp->wlock)->rlock){+.-...}, at: [<f814ae1b>]
> > ppp_xmit_process+0x19/0x451 [ppp_generic]
> >
> > which lock already depends on the new lock.
>
> I've stared over this trace several times and can't figure out what
> the problem is.
>
> Paul, any idea?
We seem to have recursed into the ppp code, apparently because a
softirq was handled inside a spin_lock_bh region. :( If I understand
the original report correctly, the stack trace looks like this in part:
[<c04153eb>] net_rx_action+0x3f/0xfe
[<c0128563>] __do_softirq+0x76/0xfd
-> #1 (_xmit_NETROM){+.-...}:
[<c01462b2>] lock_acquire+0x47/0x5e
[<c0471c9c>] _raw_spin_lock_irqsave+0x2e/0x3e
[<c040ed60>] skb_dequeue+0x12/0x4a
[<f814c237>] ppp_channel_push+0x2e/0x94 [ppp_generic]
So we were in ppp_channel_push. The first thing it does is
spin_lock_bh(&pch->downl), and then it calls skb_dequeue, which does a
spin_lock_irqsave. Then somehow we got into __do_softirq. I thought
spin_lock_bh should have stopped softirqs from running?
Paul.