Message-ID: <20130406034750.GA25339@roeck-us.net>
Date: Fri, 5 Apr 2013 20:47:50 -0700
From: Guenter Roeck <linux@...ck-us.net>
To: Arkadiusz Miskiewicz <a.miskiewicz@...il.com>
Cc: Wim Van Sebroeck <wim@...ana.be>, linux-watchdog@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: 3.8.3 and 3.9git occasional watchdog oops
On Thu, Apr 04, 2013 at 06:59:59PM -0700, Guenter Roeck wrote:
> On Fri, Apr 05, 2013 at 12:23:30AM +0200, Arkadiusz Miskiewicz wrote:
> > On Thursday 14 of March 2013, Arkadiusz Miśkiewicz wrote:
> > > Hi.
> > >
> > > Just hit watchdog related oops in 3.8.3 kernel. Unfortunately photos only.
> > >
> > > http://ixion.pld-linux.org/~arekm/watchdog-oops-3.8.3/IMG_8942.JPG
> > > http://ixion.pld-linux.org/~arekm/watchdog-oops-3.8.3/IMG_8941.JPG
> >
> > 3.9git from today isn't any better unfortunately:
> >
> > http://ixion.pld-linux.org/~arekm/watchdog-oops-3.9git.jpg
> >
> > >
> > > oops started after I enabled systemd watchdog functionality. Cannot
> > > reproduce easily.
> > >
> > > watchdog here (thinkpad t400) is:
> > > iTCO_wdt: Found a ICH9M-E TCO device (Version=2, TCOBASE=0x1060)
> >
> >
> Wonder if there is a race condition in the watchdog driver: The watchdog device
> is opened before watchdog_register_device returns. I suspect systemd waits for
> a udev event, or by some other means detects that /dev/watchdog was created,
> and opens it immediately.
>
> I just have no idea where exactly the race condition, if there is one, is
> hiding. Or maybe I am completely off track.
>
I _think_ I understand the sequence of events.

- The driver is the first watchdog driver to register.
- watchdog_dev_register() gets called and creates the watchdog misc device
  by calling misc_register().
  At that time, the matching character device (/dev/watchdog0) does not yet
  exist, and old_wdd is not set either.
- Userspace gets an event and opens /dev/watchdog.
- watchdog_open() is called and sets wdd = old_wdd, which is still NULL,
  and tries to dereference it. Bang.

If this is the problem, a simple fix would be to set old_wdd before calling
misc_register().
Can you test a patch?
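
To illustrate the suspected ordering problem (this is a userspace model only, not the patch and not the kernel code), here is a minimal sketch. The names model_open(), model_misc_register(), register_racy() and register_fixed() are made up for the illustration, and struct watchdog_device is reduced to a stand-in. The open is simulated inline at the point where the misc device would become visible, which makes the worst-case interleaving deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct watchdog_device. */
struct watchdog_device {
	int id;
};

/* Models the pointer that the /dev/watchdog open path reads. */
static struct watchdog_device *old_wdd;

/*
 * Models watchdog_open() on the legacy device. Returns false in the case
 * where the real code would dereference a NULL pointer and oops.
 */
static bool model_open(void)
{
	struct watchdog_device *wdd = old_wdd;

	if (wdd == NULL)
		return false;	/* the real open path dereferences wdd here */
	return true;
}

/*
 * Models misc_register(): as soon as the device is visible, userspace may
 * open it. The open is called inline to force that interleaving.
 */
static bool model_misc_register(void)
{
	return model_open();
}

/* Suspected current ordering: publish the device, then set old_wdd. */
static bool register_racy(struct watchdog_device *wdd)
{
	bool open_ok;

	old_wdd = NULL;
	open_ok = model_misc_register();
	old_wdd = wdd;
	return open_ok;
}

/* Proposed ordering: set old_wdd first, then publish the device. */
static bool register_fixed(struct watchdog_device *wdd)
{
	assert(wdd != NULL);

	old_wdd = wdd;
	return model_misc_register();
}
```

Under this model, register_racy() reports a failed open and register_fixed() reports a successful one. A real fix would presumably also have to reset old_wdd if misc_register() fails; the sketch leaves that out.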
Guenter