[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <8b817de5-f9fa-c794-07db-eaed1aa918c3@oracle.com>
Date: Thu, 13 Oct 2016 15:37:21 -0500
From: Babu Moger <babu.moger@...cle.com>
To: Don Zickus <dzickus@...hat.com>
Cc: mingo@...nel.org, akpm@...ux-foundation.org, ak@...ux.intel.com,
jkosina@...e.cz, baiyaowei@...s.chinamobile.com,
atomlin@...hat.com, uobergfe@...hat.com, tj@...nel.org,
hidehiro.kawai.ez@...achi.com, johunt@...mai.com,
davem@...emloft.net, sparclinux@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/2] Introduce update_arch_nmi_watchdog for arch specific
handlers
On 10/7/2016 10:51 AM, Don Zickus wrote:
> On Thu, Oct 06, 2016 at 03:16:41PM -0700, Babu Moger wrote:
>> During our testing we noticed that nmi watchdogs in sparc could not be disabled or
>> enabled dynamically using sysctl/proc interface. Sparc uses its own arch specific
>> nmi watchdogs. There is a sysctl and proc interface(proc/sys/kernel/nmi_watchdog)
>> to enable/disable nmi watchdogs. However, that is not working for sparc. There
>> is no interface to feed this parameter to arch specific nmi watchdogs.
>>
>> These patches extend the same sysctl/proc interface to enable or disable
>> these arch specific nmi watchdogs dynamically. Introduced new function
>> update_arch_nmi_watchdog which can be implemented in arch specific handlers.
>> If you think there is a better way to do this. Please advice.
>>
>> Tested on sparc. Compile tested on x86.
> Hi Babu,
>
> Thanks for the patch. Yeah, I don't test sparc at all (lack of hardware).
> Sorry about that.
>
> We did spend quite a bit of time trying to get various soft/hard lockup
> logic going for the /proc stuff and I am wondering if your patches are to
> simple and expose some of the races we tried to fix.
>
> Therefore I am wondering if we could re-use some of our logic for your case.
>
> The perf stuff is really the x86 equivalent of arch_watchdog_enable. I am
> wondering if we break that out as a __weak default function and then have
> sparc override it with its own enable/disable functions. Something along
> the lines below (compiled on x86 but untested)?
Hi Don, Sorry for the late response. I ran into issues with the setups
and new approach.
I could not use your patches as is. Reason is sparc does not define
CONFIG_HARDLOCKUP_DETECTOR.
So, defining default __weak function did not work for me. However, I
have used your idea to
define __weak functions arch_watchdog_nmi_enable and
arch_watchdog_nmi_disable when
CONFIG_HARDLOCKUP_DETECTOR is not defined. Sending v2 version now.
Please take a look. Thanks for your inputs.
>
> Cheers,
> Don
>
>
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 9acb29f..55cd2d3 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -585,15 +585,11 @@ static void watchdog(unsigned int cpu)
> */
> static unsigned long cpu0_err;
>
> -static int watchdog_nmi_enable(unsigned int cpu)
> +int __weak arch_watchdog_nmi_enable(unsigned int cpu)
> {
> struct perf_event_attr *wd_attr;
> struct perf_event *event = per_cpu(watchdog_ev, cpu);
>
> - /* nothing to do if the hard lockup detector is disabled */
> - if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
> - goto out;
> -
> /* is it already setup and enabled? */
> if (event && event->state > PERF_EVENT_STATE_OFF)
> goto out;
> @@ -619,18 +615,6 @@ static int watchdog_nmi_enable(unsigned int cpu)
> goto out_save;
> }
>
> - /*
> - * Disable the hard lockup detector if _any_ CPU fails to set up
> - * set up the hardware perf event. The watchdog() function checks
> - * the NMI_WATCHDOG_ENABLED bit periodically.
> - *
> - * The barriers are for syncing up watchdog_enabled across all the
> - * cpus, as clear_bit() does not use barriers.
> - */
> - smp_mb__before_atomic();
> - clear_bit(NMI_WATCHDOG_ENABLED_BIT, &watchdog_enabled);
> - smp_mb__after_atomic();
> -
> /* skip displaying the same error again */
> if (cpu > 0 && (PTR_ERR(event) == cpu0_err))
> return PTR_ERR(event);
> @@ -658,7 +642,36 @@ out:
> return 0;
> }
>
> -static void watchdog_nmi_disable(unsigned int cpu)
> +static int watchdog_nmi_enable(unsigned int cpu)
> +{
> + int err;
> +
> + /* nothing to do if the hard lockup detector is disabled */
> + if (!(watchdog_enabled & NMI_WATCHDOG_ENABLED))
> + return 0;
> +
> + err = arch_watchdog_nmi_enable(cpu);
> +
> + if (err) {
> + /*
> + * Disable the hard lockup detector if _any_ CPU fails to set up
> + * set up the hardware perf event. The watchdog() function checks
> + * the NMI_WATCHDOG_ENABLED bit periodically.
> + *
> + * The barriers are for syncing up watchdog_enabled across all the
> + * cpus, as clear_bit() does not use barriers.
> + */
> + smp_mb__before_atomic();
> + clear_bit(NMI_WATCHDOG_ENABLED_BIT, &watchdog_enabled);
> + smp_mb__after_atomic();
> +
> + return err;
> + }
> +
> + return 0;
> +}
> +
> +void __weak arch_watchdog_nmi_disable(unsigned int cpu)
> {
> struct perf_event *event = per_cpu(watchdog_ev, cpu);
>
> @@ -675,6 +688,11 @@ static void watchdog_nmi_disable(unsigned int cpu)
> }
> }
>
> +static void watchdog_nmi_disable(unsigned int cpu)
> +{
> + arch_watchdog_nmi_disable(cpu);
> +}
> +
> #else
> static int watchdog_nmi_enable(unsigned int cpu) { return 0; }
> static void watchdog_nmi_disable(unsigned int cpu) { return; }
Powered by blists - more mailing lists