Date:	Thu, 2 Apr 2015 17:19:51 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Don Zickus <dzickus@...hat.com>
Cc:	Chris Metcalf <cmetcalf@...hip.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Andrew Jones <drjones@...hat.com>,
	chai wen <chaiw.fnst@...fujitsu.com>,
	Ingo Molnar <mingo@...nel.org>,
	Ulrich Obergfell <uobergfe@...hat.com>,
	Fabian Frederick <fabf@...net.be>,
	Aaron Tomlin <atomlin@...hat.com>,
	Ben Zhang <benzh@...omium.org>,
	Christoph Lameter <cl@...ux.com>,
	Gilad Ben-Yossef <gilad@...yossef.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	open list <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] watchdog: nohz: don't run watchdog on nohz_full cores

On Mon, Mar 30, 2015 at 04:02:06PM -0400, Don Zickus wrote:
> On Mon, Mar 30, 2015 at 03:32:55PM -0400, Chris Metcalf wrote:
> > On 03/30/2015 03:12 PM, Don Zickus wrote:
> > >On Mon, Mar 30, 2015 at 02:51:05PM -0400, cmetcalf@...hip.com wrote:
> > >>From: Chris Metcalf <cmetcalf@...hip.com>
> > >>
> > >>Running watchdog can be a helpful debugging feature on regular
> > >>cores, but it's incompatible with nohz_full, since it forces
> > >>regular scheduling events.  Accordingly, just exit immediately
> > >>from any nohz_full core.
> > >>
> > >>An alternate approach would be to add a flags field or function to
> > >>smp_hotplug_thread to control on which cores the percpu threads
> > >>are created, but it wasn't clear that so much mechanism was needed.
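
(A minimal sketch of the early-exit approach described in the patch,
assuming the check lands in the watchdog hrtimer callback in
kernel/watchdog.c and uses the existing tick_nohz_full_cpu() helper;
the exact hook point is illustrative, not the literal v2 diff:)

	#include <linux/hrtimer.h>
	#include <linux/smp.h>
	#include <linux/tick.h>

	static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
	{
		/*
		 * Bail out on nohz_full cores: rearming this hrtimer would
		 * force exactly the periodic ticks nohz_full tries to avoid.
		 */
		if (tick_nohz_full_cpu(smp_processor_id()))
			return HRTIMER_NORESTART;

		/* ... usual soft-lockup check, then rearm ... */
		return HRTIMER_RESTART;
	}
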
> > >Hi Chris,
> > >
> > >It seems like the correct solution would be to hook into the idle_loop
> > >somehow.  If the cpu is idle, then it seems unlikely that a lockup could
> > >occur.
> > 
> > With nohz_full, though, the cpu might be running userspace code
> > with the intention of keeping kernel ticks disabled.  Even returning
> > to kernel mode to try to figure out if we "should" be running the
> > watchdog on a given core will induce exactly the kind of interrupts
> > that nohz_full is designed to prevent.
> > 
> > My assumption is generally that nohz_full cores don't spend a lot of
> > time in the kernel anyway, as they are optimized for user space.
> > 
> > I guess you could imagine doing something per-cpu on the nohz_full
> > cores where we effectively call watchdog_disable() whenever a
> > nohz_full core enters userspace, and watchdog_enable() whenever it
> > enters the kernel.  We could add some per-cpu state in the watchdog
> > code to track whether that core was currently enabled or disabled
> > to avoid double-enabling or double-disabling.  I would think
> > context_tracking_user_exit()/_enter() would be the place to do this.
> > 
> > This feels like a lot of overhead, potentially.  Thoughts?
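
(A rough sketch of that per-cpu idea; watchdog_nohz_enable()/_disable()
and the per-cpu flag are hypothetical names, and watchdog_enable()/
watchdog_disable() are assumed to behave like the smp_hotplug_thread
callbacks in kernel/watchdog.c:)

	static DEFINE_PER_CPU(bool, watchdog_armed) = true;

	/* Hypothetical hook for context_tracking_user_enter() on a
	 * nohz_full core: quiesce the watchdog before running userspace. */
	void watchdog_nohz_disable(void)
	{
		if (__this_cpu_read(watchdog_armed)) {
			__this_cpu_write(watchdog_armed, false);
			watchdog_disable(smp_processor_id());
		}
	}

	/* Hypothetical hook for context_tracking_user_exit(): rearm the
	 * watchdog once the core is back in the kernel. */
	void watchdog_nohz_enable(void)
	{
		if (!__this_cpu_read(watchdog_armed)) {
			__this_cpu_write(watchdog_armed, true);
			watchdog_enable(smp_processor_id());
		}
	}

The per-cpu flag is what prevents the double-enable/double-disable; the
remaining cost is the cancel/rearm itself on every transition, which is
exactly the overhead in question.
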
> 
> A few months ago I might have thought that a reasonable approach.  But
> recently we have added code to make the watchdog an all-or-nothing
> mechanism across the system.  This might make it difficult to do what you are
> suggesting.
> 
> I do not know enough about the nohz code to know what the right approach is
> here.  Perhaps Frederic can enlighten me?

Well, cancelling/rearming a timer on every userspace round trip sounds like
way too much overhead to me :-)

But Ingo's suggestion to disable it properly (only on nohz_full cores) looks good.
We should still be able to re-enable it everywhere with "sysctl -w kernel.watchdog=1",
and we should warn about this at boot.
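
(For the boot-time warning, a sketch assuming it lands in the existing
lockup_detector_init() in kernel/watchdog.c; the message wording is
illustrative:)

	void __init lockup_detector_init(void)
	{
		/*
		 * Tell the admin that nohz_full cores start without the
		 * lockup detector, and how to force it back on everywhere.
		 */
		if (tick_nohz_full_enabled())
			pr_info("watchdog: disabled on nohz_full cores, use sysctl kernel.watchdog=1 to run it everywhere\n");

		/* ... existing initialization ... */
	}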