Message-ID: <alpine.DEB.2.20.1709161850010.2105@nanos>
Date: Sat, 16 Sep 2017 19:35:10 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Fengguang Wu <fengguang.wu@...el.com>
cc: LKP <lkp@...org>, LKML <linux-kernel@...r.kernel.org>,
Don Zickus <dzickus@...hat.com>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>
Subject: Re: d57108d4f6 ("watchdog/core: Get rid of the thread .."): BUG:
unable to handle kernel NULL pointer dereference at 0000000000000208
On Sat, 16 Sep 2017, Fengguang Wu wrote:
> > > [ 0.038086] Performance Events: unsupported p6 CPU model 61 no PMU
> > > driver, software events only.
>
> What's your host CPU? I can reproduce it in Nehalem, Haswell and Sandy
> Bridge machines with the attached script.
My bad. I booted the wrong config ....
> > > [ 0.041031] Hierarchical SRCU implementation.
> > > [ 0.046210] NMI watchdog: Perf event create on CPU 0 failed with -2
> > > [ 0.046980] NMI watchdog: Perf NMI watchdog permanetely disabled
> > >
> > > Confused
> >
> > I still can't reproduce. Can you please apply the debug patch below and
> > provide the output?
>
> OK. I'll try and report back tomorrow.
Don't bother. I found it already. On UP we have:

#define for_each_cpu(cpu, mask)			\
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

which is a total fail, as it breaks any code which uses for_each_cpu() or
any of the other variants on UP by assuming that all cpumasks have bit 0
set. In the watchdog cleanup the loop body therefore runs for CPU 0 even
when dead_events_mask is empty, and a NULL event gets released.

That means any code which does not have conditional code for some of the
cpumask functions is potentially broken. Sigh.
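
To make the effect concrete, here is a minimal user-space sketch. It is
not kernel code: struct cpumask is reduced to a single word, the UP macro
is copied under the name for_each_cpu_up(), and for_each_cpu_checked() is
a hypothetical mask-aware variant added only for comparison.

```c
/* Simplified stand-in for the kernel's struct cpumask (assumption: one word). */
struct cpumask {
	unsigned long bits;
};

/* The UP definition quoted above: the mask is evaluated but never tested. */
#define for_each_cpu_up(cpu, mask)			\
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

/* Hypothetical mask-aware variant, for comparison only (not a kernel macro). */
#define for_each_cpu_checked(cpu, mask)			\
	for ((cpu) = 0; (cpu) < 1; (cpu)++)		\
		if ((mask)->bits & (1UL << (cpu)))

/* Count how often the loop body runs with the UP macro. */
int count_up(const struct cpumask *mask)
{
	int cpu, n = 0;

	for_each_cpu_up(cpu, mask)
		n++;
	return n;
}

/* Count how often the loop body runs when bit 0 is actually tested. */
int count_checked(const struct cpumask *mask)
{
	int cpu, n = 0;

	for_each_cpu_checked(cpu, mask)
		n++;
	return n;
}
```

With an empty mask, count_up() still reports one iteration while
count_checked() reports none. With bit 0 set, both report one.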
The simple cure for the watchdog is below.

Thanks,

	tglx

8<------------------
diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index b2931154b5f2..d4c0f75b189e 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -221,7 +221,12 @@ void hardlockup_detector_perf_cleanup(void)
 		struct perf_event *event = per_cpu(watchdog_ev, cpu);
 		per_cpu(watchdog_ev, cpu) = NULL;
-		perf_event_release_kernel(event);
+		/*
+		 * Check the event, because on UP for_each_cpu() assumes
+		 * idiotically that all masks handed in have bit 0 set.
+		 */
+		if (event)
+			perf_event_release_kernel(event);
 	}
 	cpumask_clear(&dead_events_mask);
 }
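
The guarded cleanup can be modelled in user space as follows. This is a
sketch under stated assumptions, not the kernel implementation: struct
perf_event, release_event() and the watchdog_ev[] array are hypothetical
stand-ins for the real perf event, perf_event_release_kernel() and the
per-CPU variable, and the mask is a plain word.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct perf_event. */
struct perf_event {
	int released;
};

/* The UP definition of for_each_cpu(): ignores the mask contents. */
#define for_each_cpu_up(cpu, mask)			\
	for ((cpu) = 0; (cpu) < 1; (cpu)++, (void)mask)

/* Stand-ins for per_cpu(watchdog_ev, cpu) and dead_events_mask on UP. */
struct perf_event *watchdog_ev[1];
unsigned long dead_events_mask;

/*
 * Stand-in for perf_event_release_kernel(). It dereferences its argument,
 * so it must not be called with NULL.
 */
void release_event(struct perf_event *event)
{
	event->released = 1;
}

/* Mirrors the patched loop; returns how many events were released. */
int cleanup(void)
{
	int cpu, released = 0;

	for_each_cpu_up(cpu, &dead_events_mask) {
		struct perf_event *event = watchdog_ev[cpu];

		watchdog_ev[cpu] = NULL;
		/* The loop runs for CPU 0 even if the mask is empty. */
		if (event) {
			release_event(event);
			released++;
		}
	}
	dead_events_mask = 0;
	return released;
}
```

Calling cleanup() with no event registered releases nothing and does not
touch a NULL pointer. Without the if (event) check, the same call would
dereference NULL in release_event().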