lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210428135725.GN975577@paulmck-ThinkPad-P17-Gen-1>
Date:   Wed, 28 Apr 2021 06:57:25 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Luming Yu <luming.yu@...il.com>
Cc:     Andi Kleen <ak@...ux.intel.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>, john.stultz@...aro.org,
        sboyd@...nel.org, corbet@....net, Mark.Rutland@....com,
        maz@...nel.org, kernel-team@...com, neeraju@...eaurora.org,
        feng.tang@...el.com, zhengjun.xing@...el.com,
        Chris Mason <clm@...com>
Subject: Re: [PATCH v10 clocksource 1/7] clocksource: Provide module
 parameters to inject delays in watchdog

On Wed, Apr 28, 2021 at 12:49:12PM +0800, Luming Yu wrote:
> We 'd expect to see clock_source watchdog can avoid to do wrong thing
> due to the injected delay or
> in real life delay by doing tsc sync-re-check by applying the patch-set.
> However , the noise is still cause wrong actions and the patch doesn't
> defeat the injected's delay
> please correct me if  I'm wrong.

Injecting delay is just a test.  In real life, if you got four delays
in a row, the cause is likely that the clock read is broken and taking
a very long time.  In which case marking that clock unstable is a
reasonable response.

Other causes include having an NMI or SMI storm, getting extremely
unlucky with vCPU preemptions, and so on.  In these cases, you are not
making much forward progress anyway, so marked-unstable clock is the
least of your worries.

I ran this (without injected delays) for more than a thousand hours on
rcutorture guest OSes and didn't see any instances of even two consecutive
bad reads.   There was the very occasional single instance of a bad read.

Therefore, the code marks the clock unstable if it has four bad reads
in a row, as it should.

							Thanx, Paul

> parameters]# cat *
> 1
> 1
> -1
> 3
> 8
> 
> [62939.809615] clocksource: clocksource_watchdog_inject_delay():
> Injecting delay.
> [62939.816867] clocksource: clocksource_watchdog_inject_delay():
> Injecting delay.
> [62939.824094] clocksource: clocksource_watchdog_inject_delay():
> Injecting delay.
> [62939.831314] clocksource: clocksource_watchdog_inject_delay():
> Injecting delay.
> [62939.838536] clocksource: timekeeping watchdog on CPU26: hpet
> read-back delay of 7220833ns, attempt 4, marking unstable
> [62939.849230] tsc: Marking TSC unstable due to clocksource watchdog
> [62939.855340] TSC found unstable after boot, most likely due to
> broken BIOS. Use 'tsc=unstable'.
> [62939.863972] sched_clock: Marking unstable (62943398530130,
> -3543150114)<-(62941186607503, -1331276112)
> [62939.875104] clocksource: Checking clocksource tsc synchronization
> from CPU 123 to CPUs 0,6,26,62,78,97-98,137.
> [62939.886518] clocksource: Switched to clocksource hpet
> 
> On Tue, Apr 27, 2021 at 2:27 AM Paul E. McKenney <paulmck@...nel.org> wrote:
> >
> > On Mon, Apr 26, 2021 at 10:56:27AM -0700, Andi Kleen wrote:
> > > > ------------------------------------------------------------------------
> > > >
> > > > - module parameters
> > > >
> > > >   If the scope of the fault injection capability is limited to a
> > > >   single kernel module, it is better to provide module parameters to
> > > >   configure the fault attributes.
> > > >
> > > > ------------------------------------------------------------------------
> > > >
> > > > And in this case, the fault injection capability is in fact limited to
> > > > kernel/clocksource.c.
> > >
> > >
> > > I disagree with this recommendation because it prevents fuzzer coverage.
> > >
> > > Much better to have an uniform interface that can be automatically
> > > explored.
> >
> > The permissions for these module parameters is 0644, so there is no
> > reason why the fuzzers cannot use them via sysfs.
> >
> >                                                         Thanx, Paul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ