lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87mt8pkzw1.ffs@tglx>
Date:   Thu, 17 Nov 2022 22:57:34 +0100
From:   Thomas Gleixner <tglx@...utronix.de>
To:     "Paul E. McKenney" <paulmck@...nel.org>
Cc:     linux-kernel@...r.kernel.org, john.stultz@...aro.org,
        sboyd@...nel.org, corbet@....net, Mark.Rutland@....com,
        maz@...nel.org, kernel-team@...a.com, neeraju@...eaurora.org,
        ak@...ux.intel.com, feng.tang@...el.com, zhengjun.xing@...el.com,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Chris Mason <clm@...a.com>, John Stultz <jstultz@...gle.com>,
        Waiman Long <longman@...hat.com>
Subject: Re: [PATCH clocksource 1/3] clocksource: Reject bogus watchdog
 clocksource measurements

Paul!

On Mon, Nov 14 2022 at 15:28, Paul E. McKenney wrote:
>  
> +		/* Check for bogus measurements. */
> +		wdi = jiffies_to_nsecs(WATCHDOG_INTERVAL);
> +		if (wd_nsec < (wdi >> 2)) {
> +			pr_warn("timekeeping watchdog on CPU%d: Watchdog clocksource '%s' advanced only %lld ns during %d-jiffy time interval, skipping watchdog check.\n", smp_processor_id(), watchdog->name, wd_nsec, WATCHDOG_INTERVAL);
> +			continue;
> +		}
> +		if (wd_nsec > (wdi << 2)) {
> +			pr_warn("timekeeping watchdog on CPU%d: Watchdog clocksource '%s' advanced an excessive %lld ns during %d-jiffy time interval, probable CPU overutilization, skipping watchdog check.\n", smp_processor_id(), watchdog->name, wd_nsec, WATCHDOG_INTERVAL);
> +			continue;
> +		}

This is really getting ridiculous.

The clocksource watchdog is supposed to run periodically with period =
WATCHDOG_INTERVAL.

That periodic schedule depends on the clocksource which is monitored. If
the clocksource runs fast the period is shortened and if it runs slow is
prolonged.

Now you add checks:

 1) If the period observed by the watchdog clocksource is less than 1/4
    of the expected period, everything is fine.

 2) If the period observed by the watchdog clocksource is greater than 4
    times the expected period, everything is fine.

IOW, you are preventing detection of one class of problems which caused
us to implement the watchdog clocksource in the first place.

You are preventing it by making the watchdog decision circular dependent
on the clocksource it is supposed to watch. IOW, you put a fox in charge
of the henhouse. That's a really brilliant plan.

But what's worse is the constant stream of heuristics which make the
clocksource watchdog "work" under workloads which are simply impossible
to be handled by its current implementation.

If I look at the full set of them by now, I'm pretty sure that a real
TSC fail would not be noticed anymore because there are more exceptions
and excuses why a particular measurement it bogus or invalid or
whatever.

I didn't do a full analysis yet, but I have a hard time to convince
myself that - assumed we add this gem - the watchdog will be anything
else than a useless waste of CPU cycles as there is always one of the
heuristics declaring that everything is fine along with incomprehensible
messages in dmesg which will create more confusion to most people than
being helpful.

This is hunting us for 20+ years now and why do we still need this? I'm
pretty sure that farcebook does not run their server farms on 20 years
old silicon.

That means even the largest customers have not been able to convince the
CPU manufactures to fix this idiocy by now, right? Either that or they
did not even try because it's simpler to "fix" it in software.

That's the only two explanations I have for the constant stream of
voodoo logic. Both are just a proof for my claim that this industry just
"works" by chance.

Can we stop this pretty please and either come up with something
fundamentally different or just admit defeat and remove the whole thing?

Thanks,

        tglx

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ