Date:   Tue, 30 Nov 2021 06:40:48 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Feng Tang <feng.tang@...el.com>
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...el.com>,
        "H . Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
        linux-kernel@...r.kernel.org, rui.zhang@...el.com,
        andi.kleen@...el.com, len.brown@...el.com, tim.c.chen@...el.com
Subject: Re: [PATCH v3 2/2] x86/tsc: skip tsc watchdog checking for qualified
 platforms

On Tue, Nov 30, 2021 at 02:46:23PM +0800, Feng Tang wrote:
> On Wed, Nov 17, 2021 at 10:37:51AM +0800, Feng Tang wrote:
> > There are cases where tsc clocksources are wrongly judged as unstable by
> > clocksource watchdogs like hpet, acpi_pm or 'refined-jiffies'. As there
> > is hardly a general, reliable way to check the validity of a watchdog,
> > Thomas Gleixner proposed [1] the following to protect the innocent tsc:
> 
> Hi All,
> 
> One more update: last week we got a report from the validation team that
> the "tsc judged as unstable" problem happened on the latest desktop
> platform, which has serial earlyprintk enabled, and the watchdog there is
> 'refined-jiffies' while hpet is disabled during the PC10 check. I
> tried several other client platforms I could find (Kabylake, Icelake
> and Alderlake), and the misjudgment can be easily reproduced on
> Icelake and Alderlake (not on Kabylake). This could be cured by
> this 2/2 patch.
> 
> Also, today we got the same report on a 2-socket Icelake server with
> a 5.5 kernel, where the watchdog is 'hpet' and the system is running
> a stressful big-data workload.

Were these tests run with Waiman's latest patch series?  The first
two of them are on RCU's "dev" branch.

							Thanx, Paul

> Thanks,
> Feng
> 
> 
> > "I'm inclined to lift that requirement when the CPU has:
> > 
> >     1) X86_FEATURE_CONSTANT_TSC
> >     2) X86_FEATURE_NONSTOP_TSC
> >     3) X86_FEATURE_NONSTOP_TSC_S3
> >     4) X86_FEATURE_TSC_ADJUST
> >     5) At max. 4 sockets
> > 
> >  After two decades of horrors we're finally at a point where TSC seems
> >  to be halfway reliable and less abused by BIOS tinkerers. TSC_ADJUST
> >  was really key as we can now detect even small modifications reliably
> >  and the important point is that we can cure them as well (not pretty
> >  but better than all other options)."
> > 
> > As feature #3, X86_FEATURE_NONSTOP_TSC_S3, only exists on several
> > generations of Atom processors and is always coupled with
> > X86_FEATURE_CONSTANT_TSC and X86_FEATURE_NONSTOP_TSC, skip checking it.
> > Also be more defensive and use a maximum of 2 sockets.
> > 
> > The check is done inside tsc_init() before registering the 'tsc-early'
> > and 'tsc' clocksources, as there were cases where both of them had been
> > wrongly judged as unreliable.
> > 
> > For more background on tsc/watchdog, there is a good summary in [2].
> > 
> > [1]. https://lore.kernel.org/lkml/87eekfk8bd.fsf@nanos.tec.linutronix.de/
> > [2]. https://lore.kernel.org/lkml/87a6pimt1f.ffs@nanos.tec.linutronix.de/
> > Suggested-by: Thomas Gleixner <tglx@...utronix.de>
> > Signed-off-by: Feng Tang <feng.tang@...el.com>
> > ---
> > Change log:
> > 
> >   v3:
> >     * rebased against 5.16-rc1
> >     * refine commit log
> > 
> >   v2:
> >     * Directly skip watchdog check without messing flag
> >       'tsc_clocksource_reliable' (Thomas)
> > 
> >  arch/x86/kernel/tsc.c | 22 ++++++++++++++++++----
> >  1 file changed, 18 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > index 2e076a459a0c..389511f59101 100644
> > --- a/arch/x86/kernel/tsc.c
> > +++ b/arch/x86/kernel/tsc.c
> > @@ -1180,6 +1180,12 @@ void mark_tsc_unstable(char *reason)
> >  
> >  EXPORT_SYMBOL_GPL(mark_tsc_unstable);
> >  
> > +static void __init tsc_skip_watchdog_verify(void)
> > +{
> > +	clocksource_tsc_early.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
> > +	clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
> > +}
> > +
> >  static void __init check_system_tsc_reliable(void)
> >  {
> >  #if defined(CONFIG_MGEODEGX1) || defined(CONFIG_MGEODE_LX) || defined(CONFIG_X86_GENERIC)
> > @@ -1196,6 +1202,17 @@ static void __init check_system_tsc_reliable(void)
> >  #endif
> >  	if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
> >  		tsc_clocksource_reliable = 1;
> > +
> > +	/*
> > +	 * Ideally the socket number should be checked, but this is called
> > +	 * by tsc_init() which is in early boot phase and the socket numbers
> > +	 * may not be available. Use 'nr_online_nodes' as a fallback solution
> > +	 */
> > +	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
> > +	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
> > +	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
> > +	    nr_online_nodes <= 2)
> > +		tsc_skip_watchdog_verify();
> >  }
> >  
> >  /*
> > @@ -1387,9 +1404,6 @@ static int __init init_tsc_clocksource(void)
> >  	if (tsc_unstable)
> >  		goto unreg;
> >  
> > -	if (tsc_clocksource_reliable || no_tsc_watchdog)
> > -		clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
> > -
> >  	if (boot_cpu_has(X86_FEATURE_NONSTOP_TSC_S3))
> >  		clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
> >  
> > @@ -1527,7 +1541,7 @@ void __init tsc_init(void)
> >  	}
> >  
> >  	if (tsc_clocksource_reliable || no_tsc_watchdog)
> > -		clocksource_tsc_early.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
> > +		tsc_skip_watchdog_verify();
> >  
> >  	clocksource_register_khz(&clocksource_tsc_early, tsc_khz);
> >  	detect_art();
> > -- 
> > 2.27.0
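
For readers following the thread, the qualification rule in the quoted patch can be sketched as a small userspace model. This is an illustration only, not kernel code: the struct tsc_caps and the function tsc_qualifies_for_watchdog_skip() are hypothetical names, the booleans stand in for the boot_cpu_has() checks, and nr_nodes stands in for nr_online_nodes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU feature bits the patch tests. */
struct tsc_caps {
	bool constant_tsc;	/* models X86_FEATURE_CONSTANT_TSC */
	bool nonstop_tsc;	/* models X86_FEATURE_NONSTOP_TSC */
	bool tsc_adjust;	/* models X86_FEATURE_TSC_ADJUST */
};

/*
 * Userspace model of the condition the patch adds to
 * check_system_tsc_reliable(). nr_nodes plays the role of
 * nr_online_nodes, which the patch uses in place of a socket count.
 * Returns true when the watchdog verification would be skipped.
 */
bool tsc_qualifies_for_watchdog_skip(const struct tsc_caps *caps,
				     unsigned int nr_nodes)
{
	/* Model assumption: at least one node is always online. */
	assert(nr_nodes >= 1);

	return caps->constant_tsc &&
	       caps->nonstop_tsc &&
	       caps->tsc_adjust &&
	       nr_nodes <= 2;
}
```

With all three features set, the model returns true for 1 or 2 nodes and false for 4 nodes; clearing any one feature makes it return false regardless of the node count.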
