[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAARzgwwXihtj+SnrBamPKJATnHYeuhDPcXGnD+a56OGD=p=qNQ@mail.gmail.com>
Date: Sat, 3 Jul 2021 23:00:23 +0530
From: Ani Sinha <ani@...sinha.ca>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: linux-kernel@...r.kernel.org, anirban.sinha@...ia.com,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: [PATCH v3] Add kernel logs when sched clock unstable and
NO_HZ_FULL is not possible
ping ...
On Sat, Jun 26, 2021 at 9:49 PM Ani Sinha <ani@...sinha.ca> wrote:
>
>
>
> On Tue, 22 Jun 2021, Thomas Gleixner wrote:
>
> > On Sun, Jun 20 2021 at 15:43, Ani Sinha wrote:
> >
> > > Commit 4f49b90abb4aca ("sched-clock: Migrate to use new tick
> > > dependency mask model") had also removed the kernel warning
> > > message informing the user that it was not possible to turn
> > > on NO_HZ_FULL. Adding back that log message here. It is
> > > unhelpful when the kernel turns off NO_HZ_FULL silently
> > > without informing anyone.
> > > Also added a kernel log when sched clock is marked as unstable.
> >
> > Don't do two things at once. See Documentation/process/....
>
> Ok I will split up this patch into two.
>
> >
> > Also your subject line want's a proper prefix.
>
> OK will fix.
>
> >
> > > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> > > index c2b2859ddd82..9f9fe658f8a5 100644
> > > --- a/kernel/sched/clock.c
> > > +++ b/kernel/sched/clock.c
> > > @@ -192,8 +192,11 @@ void clear_sched_clock_stable(void)
> > >
> > > smp_mb(); /* matches sched_clock_init_late() */
> > >
> > > - if (static_key_count(&sched_clock_running.key) == 2)
> > > + if (static_key_count(&sched_clock_running.key) == 2) {
> > > + WARN_ONCE(sched_clock_stable(),
> > > + "sched clock is now marked unstable.");
> >
> > What's the WARN for here? That backtrace is largely uninteresting.
>
> OK to be consistent with other parts of the code, I will replace this with
> pr_warn()
>
> >
> > > - if (can_stop_full_tick(cpu, ts))
> > > + if (can_stop_full_tick(cpu, ts)) {
> > > tick_nohz_stop_sched_tick(ts, cpu);
> > > - else if (ts->tick_stopped)
> > > - tick_nohz_restart_sched_tick(ts, ktime_get());
> > > + } else {
> > > + /*
> > > + * Don't allow the user to think they can get
> > > + * full NO_HZ with this machine.
> > > + */
> > > + WARN_ONCE(tick_nohz_full_running,
> > > + "NO_HZ_FULL will not work for the current system.");
> >
> > can_stop_full_tick() returning false can be transient and then the user
> > still has no idea _why_ this is printed.
> >
> > Also assume the user/admin starts perf and knows he's going to disturb
> > NOHZ full, then _why_ would he be interested in that warning.
> >
>
> If NOHZ is disabled intentionally, that is not an interesting case. I am
> worried about the situation where the user specifies NOHZ option in the
> kernel commandline and the kernel silently disabled this because of one or
> more limitations in the system. I want to address this. All the reasons
> specified in the following commit is still true:
>
> commit e12d0271774fea9fddf1e2a7952a0bffb2ee8e8b
> Author: Steven Rostedt <rostedt@...dmis.org>
> Date: Fri May 10 17:12:28 2013 -0400
>
> nohz: Warn if the machine can not perform nohz_full
>
> If the user configures NO_HZ_FULL and defines nohz_full=XXX on the
> kernel command line, or enables NO_HZ_FULL_ALL, but nohz fails
> due to the machine having a unstable clock, warn about it.
>
> We do not want users thinking that they are getting the benefit
> of nohz when their machine can not support it.
>
> Signed-off-by: Steven Rostedt <rostedt@...dmis.org>
> Cc: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
> Cc: Ingo Molnar <mingo@...nel.org>
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: H. Peter Anvin <hpa@...or.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: Li Zhong <zhong@...ux.vnet.ibm.com>
> Signed-off-by: Frederic Weisbecker <fweisbec@...il.com>
>
> This log was removed from the kernel in commit
>
> commit 4f49b90abb4aca6fe677c95fc352fd0674d489bd
> Author: Frederic Weisbecker <fweisbec@...il.com>
> Date: Wed Jul 22 17:03:52 2015 +0200
>
> sched-clock: Migrate to use new tick dependency mask model
>
> Instead of checking sched_clock_stable from the nohz subsystem to
> verify
> its tick dependency, migrate it to the new mask in order to include it
> to the all-in-one check.
>
>
> and it seems to me that the removal of the log defeats the purpose that
> commit e12d0271774fea9fddf tried to serve.
>
> Please enlighten me as to how to achieve this and I will cook up
> something.
>
> Ani
Powered by blists - more mailing lists