Message-ID: <20190123171106.GA17710@lenoir>
Date: Wed, 23 Jan 2019 18:11:08 +0100
From: Frederic Weisbecker <frederic@...nel.org>
To: Nicholas Piggin <npiggin@...il.com>
Cc: Frederic Weisbecker <fweisbec@...il.com>,
linux-kernel@...r.kernel.org, Michael Neuling <mikey@...ling.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC PATCH] time/nohz: allow the boot CPU to be nohz_full
On Wed, Jan 23, 2019 at 06:25:26PM +1000, Nicholas Piggin wrote:
> Frederic Weisbecker's on January 17, 2019 3:54 am:
> > On Mon, Jan 14, 2019 at 04:47:45PM +1000, Nicholas Piggin wrote:
> >> We have a supercomputer site testing nohz_full to reduce jitter with
> >> good results, but they want CPU0 to be nohz_full. That happens to be
> >> the boot CPU, which is disallowed by the nohz_full code.
> >>
> >> They have existing job scheduling code which wants this, I don't know
> >> too much detail beyond that, but I hope the kernel can be made to
> >> work with their config.
> >>
> >> This patch has the boot CPU take over the jiffies update in the low
> >> res timer before SMP is brought up, after which the nohz CPU will take
> >> over.
> >>
> >> It also modifies the housekeeping check code a bit to ensure at least
> >> one !nohz CPU is in the present map so it comes up at boot, rather
> >> than having the nohz code take the boot CPU out of the nohz mask.
> >>
> >> This keeps jiffies incrementing on the nohz_full boot CPU before SMP
> >> init, but I'm not sure if this is covering all races and platform
> >> considerations. Sorry I don't know the timer code too well, I would
> >> appreciate any help.
> >>
> >> Thanks,
> >> Nick
> >
> > We used to allow that and that broke hibernation :)
>
> Oh interesting to know thanks, I'll look up the old code.
>
> > So, since we need to have at least one CPU alive to handle the
> > timekeeping updates on behalf of nohz CPUs, we forbid it from going
> > idle or offline, for simplicity. Now, hibernation requires disabling
> > non-boot CPUs. So if the timekeeper is not the boot CPU, it's going to
> > refuse the hotplug operation and break hibernation.
>
> Simplest would be just to make them mutually exclusive. I don't think
> this customer needs hibernation.

That's something I can recommend out of tree. Of course, upstream we can't
break an existing feature for the sake of a new one.

>
> In the longer run, I wonder if it would be nice to allow CPUs to change
> in and out of nohz-full mode at runtime, and the time keeper CPU to be
> able to be migrated at runtime like it does for nohz idle. Maybe that's
> over engineering things if there is no real demand for it though.

Indeed, the plan is to be able to switch CPUs to and from nohz_full
dynamically at runtime, through cpuset for example. CPU 0 may stay the
exception though.

Now, housekeepers (i.e. CPUs that are not nohz_full) need to be numerous on
machines with a large number of CPUs. With this kind of scenario in mind, we
could arrange to allow migration of the timekeeping duty. But then again,
that's an invasive change. Like you just said, so far I haven't heard of real
demand; you're the first one :-)
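
To make the hibernation constraint discussed earlier in this thread concrete,
here is a toy userspace model in C. It is only a sketch and not kernel code:
struct toy_system, toy_init(), toy_cpu_down() and toy_hibernate() are
hypothetical names invented for illustration, and the model assumes a fixed
timekeeper that refuses to go offline and a hibernation path that takes down
every non-boot CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS  4
#define BOOT_CPU 0

/* Hypothetical toy state; these are not kernel data structures. */
struct toy_system {
	bool online[NR_CPUS];
	int timekeeper;	/* CPU doing timekeeping on behalf of nohz_full CPUs */
};

/* Bring every CPU online and pin the timekeeping duty to one CPU. */
static void toy_init(struct toy_system *s, int timekeeper)
{
	assert(timekeeper >= 0 && timekeeper < NR_CPUS);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		s->online[cpu] = true;
	s->timekeeper = timekeeper;
}

/* Offline one CPU. Returns 0 on success, -1 if the request is refused
 * because the CPU is the timekeeper, which must stay alive. */
static int toy_cpu_down(struct toy_system *s, int cpu)
{
	if (cpu == s->timekeeper)
		return -1;
	s->online[cpu] = false;
	return 0;
}

/* Model of the hibernation step that disables all non-boot CPUs.
 * Returns -1 as soon as one of them refuses to go offline. */
static int toy_hibernate(struct toy_system *s)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == BOOT_CPU)
			continue;
		if (toy_cpu_down(s, cpu) != 0)
			return -1;
	}
	return 0;
}
```

In this model, toy_hibernate() succeeds when the timekeeper is the boot CPU
and fails when it is any other CPU, which is the case where the hotplug
operation is refused.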