lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAARzgwwXihtj+SnrBamPKJATnHYeuhDPcXGnD+a56OGD=p=qNQ@mail.gmail.com>
Date:   Sat, 3 Jul 2021 23:00:23 +0530
From:   Ani Sinha <ani@...sinha.ca>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     linux-kernel@...r.kernel.org, anirban.sinha@...ia.com,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: [PATCH v3] Add kernel logs when sched clock unstable and
 NO_HZ_FULL is not possible

ping ...

On Sat, Jun 26, 2021 at 9:49 PM Ani Sinha <ani@...sinha.ca> wrote:
>
>
>
> On Tue, 22 Jun 2021, Thomas Gleixner wrote:
>
> > On Sun, Jun 20 2021 at 15:43, Ani Sinha wrote:
> >
> > > Commit 4f49b90abb4aca ("sched-clock: Migrate to use new tick
> > > dependency mask model") had also removed the kernel warning
> > > message informing the user that it was not possible to turn
> > > on NO_HZ_FULL. Adding back that log message here. It is
> > > unhelpful when the kernel turns off NO_HZ_FULL silently
> > > without informing anyone.
> > > Also added a kernel log when sched clock is marked as unstable.
> >
> > Don't do two things at once. See Documentation/process/....
>
> Ok I will split up this patch into two.
>
> >
> > Also your subject line want's a proper prefix.
>
> OK will fix.
>
> >
> > > diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
> > > index c2b2859ddd82..9f9fe658f8a5 100644
> > > --- a/kernel/sched/clock.c
> > > +++ b/kernel/sched/clock.c
> > > @@ -192,8 +192,11 @@ void clear_sched_clock_stable(void)
> > >
> > >     smp_mb(); /* matches sched_clock_init_late() */
> > >
> > > -   if (static_key_count(&sched_clock_running.key) == 2)
> > > +   if (static_key_count(&sched_clock_running.key) == 2) {
> > > +           WARN_ONCE(sched_clock_stable(),
> > > +                     "sched clock is now marked unstable.");
> >
> > What's the WARN for here? That backtrace is largely uninteresting.
>
> OK to be consistent with other parts of the code, I will replace this with
> pr_warn()
>
> >
> > > -   if (can_stop_full_tick(cpu, ts))
> > > +   if (can_stop_full_tick(cpu, ts)) {
> > >             tick_nohz_stop_sched_tick(ts, cpu);
> > > -   else if (ts->tick_stopped)
> > > -           tick_nohz_restart_sched_tick(ts, ktime_get());
> > > +   } else {
> > > +           /*
> > > +            * Don't allow the user to think they can get
> > > +            * full NO_HZ with this machine.
> > > +            */
> > > +           WARN_ONCE(tick_nohz_full_running,
> > > +                     "NO_HZ_FULL will not work for the current system.");
> >
> > can_stop_full_tick() returning false can be transient and then the user
> > still has no idea _why_ this is printed.
> >
> > Also assume the user/admin starts perf and knows he's going to disturb
> > NOHZ full, then _why_ would he be interested in that warning.
> >
>
> If NOHZ is disabled intentionally, that is not an interesting case. I am
> worried about the situation where the user specifies NOHZ option in the
> kernel commandline and the kernel silently disabled this because of one or
> more limitations in the system. I want to address this. All the reasons
> specified in the following commit is still true:
>
> commit e12d0271774fea9fddf1e2a7952a0bffb2ee8e8b
> Author: Steven Rostedt <rostedt@...dmis.org>
> Date:   Fri May 10 17:12:28 2013 -0400
>
>     nohz: Warn if the machine can not perform nohz_full
>
>     If the user configures NO_HZ_FULL and defines nohz_full=XXX on the
>     kernel command line, or enables NO_HZ_FULL_ALL, but nohz fails
>     due to the machine having a unstable clock, warn about it.
>
>     We do not want users thinking that they are getting the benefit
>     of nohz when their machine can not support it.
>
>     Signed-off-by: Steven Rostedt <rostedt@...dmis.org>
>     Cc: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
>     Cc: Ingo Molnar <mingo@...nel.org>
>     Cc: Andrew Morton <akpm@...ux-foundation.org>
>     Cc: Thomas Gleixner <tglx@...utronix.de>
>     Cc: H. Peter Anvin <hpa@...or.com>
>     Cc: Peter Zijlstra <peterz@...radead.org>
>     Cc: Borislav Petkov <bp@...en8.de>
>     Cc: Li Zhong <zhong@...ux.vnet.ibm.com>
>     Signed-off-by: Frederic Weisbecker <fweisbec@...il.com>
>
> This log was removed from the kernel in commit
>
> commit 4f49b90abb4aca6fe677c95fc352fd0674d489bd
> Author: Frederic Weisbecker <fweisbec@...il.com>
> Date:   Wed Jul 22 17:03:52 2015 +0200
>
>     sched-clock: Migrate to use new tick dependency mask model
>
>     Instead of checking sched_clock_stable from the nohz subsystem to
> verify
>     its tick dependency, migrate it to the new mask in order to include it
>     to the all-in-one check.
>
>
> and it seems to me that the removal of the log defeats the purpose that
> commit e12d0271774fea9fddf tried to serve.
>
> Please enlighten me as to how to achieve this and I will cook up
> something.
>
> Ani

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ