Message-ID: <alpine.DEB.2.20.1710022105570.2114@nanos>
Date: Mon, 2 Oct 2017 21:32:57 +0200 (CEST)
From: Thomas Gleixner <tglx@...utronix.de>
To: Linus Torvalds <torvalds@...ux-foundation.org>
cc: LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>,
Peter Zijlstra <peterz@...radead.org>,
Don Zickus <dzickus@...hat.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Michael Ellerman <mpe@...erman.id.au>,
Nicholas Piggin <npiggin@...il.com>,
ppc-dev <linuxppc-dev@...ts.ozlabs.org>
Subject: Re: [RFC GIT Pull] core watchdog sanitizing
On Mon, 2 Oct 2017, Linus Torvalds wrote:
> Side note: would it perhaps make sense to have that
> cpus_read_lock/unlock() sequence around the whole reconfiguration
> section?
>
> Because while looking at that sequence, it looks a bit odd to me that
> CPUs can come and go in the middle of the NMI watchdog
> reconfiguration sequence.
>
> In particular, what happens if a new CPU is brought up just as the NMI
> watchdog is being reconfigured? The NMIs have been stopped for the
> old CPUs; what happens for the new one that came up in between that
> watchdog_nmi_stop/start?
>
> This may be all obviously safe, I'm just asking for clarification.
It's safe because the newly upcoming CPU will see an empty enabled mask in
the powerpc implementation. The perf-based implementation has a similar
protection.
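
To illustrate the idea, here is a toy userspace model - not kernel code;
the masks are plain bitfields and the function names are only borrowed
from the powerpc implementation for readability. The point is that the
stop path only touches CPUs which are already in the enabled mask, and
the start path arms everything that is online and allowed, so a CPU that
shows up in between is simply picked up by start and never double-stopped:

/*
 * Toy userspace model of the mask based protection - illustrative only.
 * "CPUs" are just bit numbers in plain bitfields.
 */
#include <stdio.h>

#define NR_CPUS 8

static unsigned int wd_cpus_enabled;	/* CPUs with the watchdog armed */
static unsigned int cpu_online_mask;	/* CPUs currently online */
static unsigned int watchdog_cpumask;	/* CPUs allowed to run the watchdog */

static void stop_wd_on_cpu(int cpu)
{
	wd_cpus_enabled &= ~(1u << cpu);
	printf("stop  watchdog on CPU%d\n", cpu);
}

static void start_wd_on_cpu(int cpu)
{
	if (wd_cpus_enabled & (1u << cpu))
		return;				/* already armed */
	wd_cpus_enabled |= 1u << cpu;
	printf("start watchdog on CPU%d\n", cpu);
}

static void watchdog_nmi_stop(void)
{
	/* Only CPUs in the enabled mask are touched */
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (wd_cpus_enabled & (1u << cpu))
			stop_wd_on_cpu(cpu);
}

static void watchdog_nmi_start(void)
{
	/* Arm every online CPU which is allowed to run the watchdog */
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (cpu_online_mask & watchdog_cpumask & (1u << cpu))
			start_wd_on_cpu(cpu);
}

int main(void)
{
	watchdog_cpumask = 0x0f;	/* CPUs 0-3 may run the watchdog */
	cpu_online_mask  = 0x07;	/* CPUs 0-2 are online */
	watchdog_nmi_start();

	/* Reconfiguration starts: all armed CPUs are stopped */
	watchdog_nmi_stop();

	/*
	 * CPU3 comes up in the middle. It is not in wd_cpus_enabled,
	 * so there is nothing to stop on it ...
	 */
	cpu_online_mask |= 1u << 3;

	/* ... and the start path arms it like every other online CPU */
	watchdog_nmi_start();
	return 0;
}
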
Though yes, it would be more obvious to expand the cpus locked
section. That requires a bit of shuffling. Untested patch below.
Thanks,
tglx
8<------------------
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -359,21 +359,17 @@ void watchdog_nmi_stop(void)
 {
 	int cpu;
 
-	cpus_read_lock();
 	for_each_cpu(cpu, &wd_cpus_enabled)
 		stop_wd_on_cpu(cpu);
-	cpus_read_unlock();
 }
 
 void watchdog_nmi_start(void)
 {
 	int cpu;
 
-	cpus_read_lock();
 	watchdog_calc_timeouts();
 	for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask)
 		start_wd_on_cpu(cpu);
-	cpus_read_unlock();
 }
 
 /*
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -351,7 +351,7 @@ void smpboot_update_cpumask_percpu_threa
 	static struct cpumask tmp;
 	unsigned int cpu;
 
-	get_online_cpus();
+	lockdep_assert_cpus_held();
 	mutex_lock(&smpboot_threads_lock);
 
 	/* Park threads that were exclusively enabled on the old mask. */
@@ -367,7 +367,6 @@ void smpboot_update_cpumask_percpu_threa
 	cpumask_copy(old, new);
 
 	mutex_unlock(&smpboot_threads_lock);
-	put_online_cpus();
 }
 
 static DEFINE_PER_CPU(atomic_t, cpu_hotplug_state) = ATOMIC_INIT(CPU_POST_DEAD);
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -535,7 +535,6 @@ static void softlockup_update_smpboot_th
 
 	smpboot_update_cpumask_percpu_thread(&watchdog_threads,
 					     &watchdog_allowed_mask);
-	__lockup_detector_cleanup();
 }
 
 /* Temporarily park all watchdog threads */
@@ -554,6 +553,7 @@ static void softlockup_unpark_threads(vo
 
 static void softlockup_reconfigure_threads(void)
 {
+	cpus_read_lock();
 	watchdog_nmi_stop();
 	softlockup_park_all_threads();
 	set_sample_period();
@@ -561,6 +561,12 @@ static void softlockup_reconfigure_threa
 	if (watchdog_enabled && watchdog_thresh)
 		softlockup_unpark_threads();
 	watchdog_nmi_start();
+	cpus_read_unlock();
+	/*
+	 * Must be called outside the cpus locked section to prevent
+	 * recursive locking in the perf code.
+	 */
+	__lockup_detector_cleanup();
 }
 
 /*
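
For reference, with the hunks above applied softlockup_reconfigure_threads()
ends up reading roughly like this (stitched together from the diff context;
the lines between set_sample_period() and the watchdog_enabled check are not
visible in the hunks and are elided):

static void softlockup_reconfigure_threads(void)
{
	cpus_read_lock();
	watchdog_nmi_stop();
	softlockup_park_all_threads();
	set_sample_period();
	/* ... lines not shown in the hunk context above ... */
	if (watchdog_enabled && watchdog_thresh)
		softlockup_unpark_threads();
	watchdog_nmi_start();
	cpus_read_unlock();
	/*
	 * Must be called outside the cpus locked section to prevent
	 * recursive locking in the perf code.
	 */
	__lockup_detector_cleanup();
}
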