lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 1 Nov 2017 13:32:45 -0700
From:   Guenter Roeck <linux@...ck-us.net>
To:     mingo@...nel.org, linux-kernel@...r.kernel.org,
        peterz@...radead.org, tglx@...utronix.de, dzickus@...hat.com,
        hpa@...or.com
Cc:     linux-tip-commits@...r.kernel.org
Subject: Re: [tip:core/urgent] watchdog/harclockup/perf: Revert a33d44843d45
 ("watchdog/hardlockup/perf: Simplify deferred event destroy")

On Wed, Nov 01, 2017 at 12:46:00PM -0700, tip-bot for Thomas Gleixner wrote:
> Commit-ID:  1c294733b7b9f712f78d15cfa75ffdea72b79abb
> Gitweb:     https://git.kernel.org/tip/1c294733b7b9f712f78d15cfa75ffdea72b79abb
> Author:     Thomas Gleixner <tglx@...utronix.de>
> AuthorDate: Tue, 31 Oct 2017 22:32:00 +0100
> Committer:  Thomas Gleixner <tglx@...utronix.de>
> CommitDate: Wed, 1 Nov 2017 20:41:27 +0100
> 
> watchdog/harclockup/perf: Revert a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy")
> 
> Guenter reported a crash in the watchdog/perf code, which is caused by
> cleanup() and enable() running concurrently. The reason for this is:
> 
> The watchdog functions are serialized via the watchdog_mutex and cpu
> hotplug locking, but the enable of the perf based watchdog happens in
> context of the unpark callback of the smpboot thread. But that unpark
> function is not synchronous inside the locking. The unparking of the thread
> just wakes it up and leaves so there is no guarantee when the thread is
> executing.
> 
> If it starts running _before_ the cleanup happened then it will create a
> event and overwrite the dead event pointer. The new event is then cleaned
> up because the event is marked dead.
> 
>     lock(watchdog_mutex);
>     lockup_detector_reconfigure();
>         cpus_read_lock();
> 	stop();
> 	   park()
> 	update();
> 	start();
> 	   unpark()
> 	cpus_read_unlock();		thread runs()
> 					  overwrite dead event ptr
> 	cleanup();
> 	  free new event, which is active inside perf....
>     unlock(watchdog_mutex);
> 
> The park side is safe as that actually waits for the thread to reach
> parked state.
> 
> Commit a33d44843d45 removed the protection against this kind of scenario
> under the stupid assumption that the hotplug serialization and the
> watchdog_mutex cover everything. 
> 
> Bring it back.
> 
> Reverts: a33d44843d45 ("watchdog/hardlockup/perf: Simplify deferred event destroy")
> Reported-and-tested-by: Guenter Roeck <linux@...ck-us.net>
> Signed-off-by: Thomas Feels-stupid Gleixner <tglx@...utronix.de>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Don Zickus <dzickus@...hat.com>
> Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710312145190.1942@nanos
> 
> ---
>  kernel/watchdog_hld.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
> index 71a62ce..f8db56b 100644
> --- a/kernel/watchdog_hld.c
> +++ b/kernel/watchdog_hld.c
> @@ -21,6 +21,7 @@
>  static DEFINE_PER_CPU(bool, hard_watchdog_warn);
>  static DEFINE_PER_CPU(bool, watchdog_nmi_touch);
>  static DEFINE_PER_CPU(struct perf_event *, watchdog_ev);
> +static DEFINE_PER_CPU(struct perf_event *, dead_event);
>  static struct cpumask dead_events_mask;
>  
>  static unsigned long hardlockup_allcpu_dumped;
> @@ -203,6 +204,8 @@ void hardlockup_detector_perf_disable(void)
>  
>  	if (event) {
>  		perf_event_disable(event);
> +		this_cpu_write(watchdog_ev, NULL);
> +		this_cpu_write(dead_event, event);
>  		cpumask_set_cpu(smp_processor_id(), &dead_events_mask);
>  		watchdog_cpus--;
>  	}
> @@ -218,7 +221,7 @@ void hardlockup_detector_perf_cleanup(void)
>  	int cpu;
>  
>  	for_each_cpu(cpu, &dead_events_mask) {
> -		struct perf_event *event = per_cpu(watchdog_ev, cpu);
> +		struct perf_event *event = per_cpu(dead_event, cpu);
>  
>  		/*
>  		 * Required because for_each_cpu() reports  unconditionally
> @@ -226,7 +229,7 @@ void hardlockup_detector_perf_cleanup(void)
>  		 */
>  		if (event)
>  			perf_event_release_kernel(event);
> -		per_cpu(watchdog_ev, cpu) = NULL;
> +		per_cpu(dead_event_ev, cpu) = NULL;

Uuhh ... there is an extra _ev which somehow slipped in and doesn't
compile.

Guenter

>  	}
>  	cpumask_clear(&dead_events_mask);
>  }

Powered by blists - more mailing lists