Message-ID: <20161215184114.ut2ulhhflap5bfur@atomlin.usersys.redhat.com>
Date: Thu, 15 Dec 2016 18:41:15 +0000
From: Aaron Tomlin <atomlin@...hat.com>
To: Don Zickus <dzickus@...hat.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Ulrich Obergfell <uobergfe@...hat.com>
Subject: Re: [PATCH] kernel/watchdog: Prevent false hardlockup on overloaded system
On Tue 2016-12-06 11:17 -0500, Don Zickus wrote:
> On an overloaded system, it is possible that a change in the watchdog threshold
> can be delayed long enough to trigger a false positive.
>
> This can easily be achieved by having a cpu spinning indefinitely on a task,
> while another cpu updates watchdog threshold.
>
> What happens is while trying to park the watchdog threads, the hrtimers on the
> other cpus trigger and reprogram themselves with the new slower watchdog
> threshold. Meanwhile, the nmi watchdog is still programmed with the old faster
> threshold.
>
> Because the one cpu is blocked, it prevents the thread parking on the other
> cpus from completing, which is needed to shut down the nmi watchdog and
> reprogram it correctly. As a result, a false positive from the nmi watchdog is
> reported.
>
> Fix this by setting a park_in_progress flag to block all lockups
> until the parking is complete.
>
> Fix provided by Ulrich Obergfell.
>
> Cc: Ulrich Obergfell <uobergfe@...hat.com>
> Signed-off-by: Don Zickus <dzickus@...hat.com>
> ---
> include/linux/nmi.h | 1 +
> kernel/watchdog.c | 9 +++++++++
> kernel/watchdog_hld.c | 3 +++
> 3 files changed, 13 insertions(+)
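
For anyone following the thread without the diff at hand, the idea in the
description can be sketched in userspace C roughly as below. This is only an
illustration under assumptions, not the patch itself: struct cpu_state,
nmi_check(), begin_park() and end_park() are hypothetical names, and the
counter comparison is a simplified stand-in for the hard lockup check. Only
the park_in_progress flag name comes from the description above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for the flag named in the patch description. */
static atomic_int park_in_progress;

/* Hypothetical model of the per-cpu state the NMI check looks at. */
struct cpu_state {
	unsigned long hrtimer_interrupts;       /* bumped by the hrtimer */
	unsigned long hrtimer_interrupts_saved; /* value seen at the last check */
};

/* Hypothetical helpers marking the start and end of thread parking. */
static void begin_park(void) { atomic_store(&park_in_progress, 1); }
static void end_park(void)   { atomic_store(&park_in_progress, 0); }

/* Returns true when a hard lockup would be reported. */
static bool nmi_check(struct cpu_state *cpu)
{
	/*
	 * While parking is in progress the hrtimer and the NMI watchdog may
	 * be running with different thresholds, so report nothing.
	 */
	if (atomic_load(&park_in_progress))
		return false;

	/* No hrtimer progress since the last check: treat as a lockup. */
	if (cpu->hrtimer_interrupts == cpu->hrtimer_interrupts_saved)
		return true;

	cpu->hrtimer_interrupts_saved = cpu->hrtimer_interrupts;
	return false;
}
```

With the flag set via begin_park(), nmi_check() stays quiet even if the
counter has not moved; once end_park() clears it, the normal comparison
applies again.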
Looks fine to me.
Reviewed-by: Aaron Tomlin <atomlin@...hat.com>
--
Aaron Tomlin