Message-ID: <20070212064446.GA1651@ff.dom.local>
Date: Mon, 12 Feb 2007 07:44:46 +0100
From: Jarek Poplawski <jarkao2@...pl>
To: Stephen Hemminger <shemminger@...ux-foundation.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>, netdev@...r.kernel.org,
"bugme-daemon\@kernel-bugs\.osdl\.org"
<bugme-daemon@...zilla.kernel.org>, pterjan@...il.com
Subject: Re: [Bugme-new] [Bug 7962] New: oops in port_carrier_check
On Fri, Feb 09, 2007 at 09:52:04AM -0800, Stephen Hemminger wrote:
> On Fri, 9 Feb 2007 08:42:11 +0100
> Jarek Poplawski <jarkao2@...pl> wrote:
>
> > On 07-02-2007 23:09, Stephen Hemminger wrote:
> > > On Wed, 7 Feb 2007 12:52:16 -0800
> > > Andrew Morton <akpm@...ux-foundation.org> wrote:
> > ...
> > >> Feb 7 21:20:18 plop kernel: BUG: unable to handle kernel paging request at
> > >> virtual address 6b6b6b6b
> > >> Feb 7 21:20:18 plop kernel: printing eip:
> > >> Feb 7 21:20:18 plop kernel: *pde = 00000000
> > >> Feb 7 21:20:18 plop kernel: Oops: 0000 [#1]
> > >> Feb 7 21:20:18 plop kernel: CPU: 0
> > >> Feb 7 21:20:19 plop kernel: EIP: 0060:[pg0+814360305/1067136000] Not
> > >> tainted VLI
> > >> Feb 7 21:20:19 plop kernel: EIP: 0060:[<f0eed6f1>] Not tainted VLI
> > >> Feb 7 21:20:19 plop kernel: EFLAGS: 00010202 (2.6.20.0.rc7-1mdv #1)
> > >> Feb 7 21:20:19 plop kernel: EIP is at port_carrier_check+0x22/0x75 [bridge]
> > >> Feb 7 21:20:19 plop kernel: eax: 6b6b6b6b ebx: 6b6b6b6b ecx: 00000000
> >
> > I think it's caused by pending delayed workqueue
> > trying to use dev after kfree (POISON_FREE in eax, ebx).
> >
> > > static void port_carrier_check(struct work_struct *work)
> > > {
> > > struct net_bridge_port *p;
> > > struct net_device *dev;
> > > struct net_bridge *br;
> > >
> > > dev = container_of(work, struct net_bridge_port,
> > > carrier_check.work)->dev;
> > > work_release(work);
> > >
> > > rtnl_lock();
> > > p = dev->br_port;
> > > if (!p)
> > > goto done;
> > > br = p->br;
> > >
> > > if (netif_carrier_ok(dev))
> > > p->path_cost = port_cost(dev);
> > >
> > > if (br->dev->flags & IFF_UP) {
> >
> > My investigation seems to point at this line (p == ebx
> > but not NULL because of mem debugging on, probably).
Sorry, I pasted too much. This is the line I meant:
--> br = p->br;
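To illustrate why p is not NULL here, below is a rough userspace model, not kernel code. POISON_FREE is 0x6b in include/linux/poison.h; struct fake_dev, poison_object() and stale_read_br_port() are names made up for this sketch, and the uint32_t fields stand in for 32-bit pointers. The real slab debugging code does more than a plain memset, so treat this as a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define POISON_FREE 0x6b	/* value from include/linux/poison.h */

/* Hypothetical stand-in for a freed net_device on a 32-bit machine. */
struct fake_dev {
	uint32_t br_port;	/* models the 32-bit dev->br_port pointer */
	uint32_t flags;
};

/* Simplified model of slab debugging: fill the freed object with poison. */
static void poison_object(struct fake_dev *dev)
{
	assert(dev != NULL);
	memset(dev, POISON_FREE, sizeof(*dev));
}

/*
 * Returns what a stale reader loads from dev->br_port after the free.
 * The value is non-zero, so an "if (!p)" test does not catch it and
 * the following p->br dereference is what faults.
 */
static uint32_t stale_read_br_port(void)
{
	struct fake_dev dev = { .br_port = 0x1000, .flags = 0 };

	poison_object(&dev);
	return dev.br_port;
}
```

Every byte of the field is 0x6b, which matches the 6b6b6b6b seen in eax and ebx.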
> The carrier_check is canceled by removal of port from bridge.
> Perhaps there is something broken in rcu assumptions under Qemu
If you mean this:
> static void del_nbp(struct net_bridge_port *p)
> {
> ...
> cancel_delayed_work(&p->carrier_check);
it's not sufficient. According to workqueue.h:
> /*
> * Kill off a pending schedule_delayed_work(). Note that the work callback
> * function may still be running on return from cancel_delayed_work(). Run
> * flush_scheduled_work() to wait on it.
> */
> static inline int cancel_delayed_work(struct delayed_work *work)
I can't see how RCU could help here: the pointer to dev is
handed to the delayed work outside of any RCU read-side section.
IMHO dev_hold/dev_put (or something similar) is needed here.
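A rough sketch of the idea, as a userspace model only and not a proposed patch: the pending work owns its own reference, so the device cannot go away until the callback has finished. All names below (fake_dev, fake_dev_hold, fake_dev_put, schedule_check, run_check) are hypothetical stand-ins; the "freed" flag replaces the real free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted net_device. */
struct fake_dev {
	int refcnt;
	bool freed;	/* set where the real code would free the device */
};

static void fake_dev_hold(struct fake_dev *dev)
{
	dev->refcnt++;
}

static void fake_dev_put(struct fake_dev *dev)
{
	assert(dev->refcnt > 0);
	if (--dev->refcnt == 0)
		dev->freed = true;
}

/* Queue the check: the pending work takes its own reference. */
static struct fake_dev *schedule_check(struct fake_dev *dev)
{
	fake_dev_hold(dev);
	return dev;
}

/* Work callback: use dev, then drop the reference taken when queued. */
static bool run_check(struct fake_dev *dev)
{
	bool valid = !dev->freed;

	fake_dev_put(dev);
	return valid;
}

/* The port is removed while the work is still pending. */
static bool late_work_sees_valid_dev(void)
{
	struct fake_dev dev = { .refcnt = 1, .freed = false };
	struct fake_dev *pending = schedule_check(&dev);
	bool ok;

	fake_dev_put(&dev);		/* owner drops its reference */
	ok = run_check(pending);	/* late callback still sees a live dev */
	return ok && dev.freed;		/* freed only after the work is done */
}
```

In this model the owner's put does not free the device, because the queued work still holds a reference; the free happens only when the callback drops it.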
Regards,
Jarek P.