lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <86521835-f13f-4d43-9a38-9a55abae0b89@paulmck-laptop>
Date:   Wed, 7 Jun 2023 12:04:49 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Feng Tang <feng.tang@...el.com>
Cc:     Chris Bainbridge <chris.bainbridge@...il.com>, tglx@...utronix.de,
        sboyd@...nel.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: PROBLEM: skew message does not handle negative ns skew

On Tue, Jun 06, 2023 at 09:52:11PM +0800, Feng Tang wrote:
> On Tue, Jun 06, 2023 at 02:09:08PM +0100, Chris Bainbridge wrote:
> > On Tue, 6 Jun 2023 at 13:50, Feng Tang <feng.tang@...el.com> wrote:
> > >
> > > And I'm have no idea if there is a real hardware/firmware issue
> > > or just a false alarm.
> > 
> > Is a negative reported skew valid? I don't know, I had assumed so, so
> > the problem was the conversion from -878159 ns to 18446744073708 ms.
> 
> I think it's valid. The related code is from kernel/time/clocksource.c: 
> 
> 	"
> 	cs_wd_msec = div_u64_rem(cs_nsec - wd_nsec, 1000U * 1000U, &wd_rem);
> 	wd_msec = div_u64_rem(wd_nsec, 1000U * 1000U, &wd_rem);
> 	pr_warn("                      Clocksource '%s' skewed %lld ns (%lld ms) over watchdog '%s' interval of %lld ns (%lld ms)\n",
> 		cs->name, cs_nsec - wd_nsec, cs_wd_msec, watchdog->name, wd_nsec, wd_msec);
> 	"
> 
> The negative value just means the watchdog is running faster than
> TSC in the 512 ms checking interval. The 18446744073708 ms is just
> a conversion from s64 value in ns (-878159) to a u64 ns, then a
> u64 ms. 

That is a bit user-unfriendly.  Does the following fix address this
issue at your end?

							Thanx, Paul

------------------------------------------------------------------------

commit 8eb836f2dd44cb1e80dfc603cf47c03603dadcdb
Author: Paul E. McKenney <paulmck@...nel.org>
Date:   Wed Jun 7 11:59:49 2023 -0700

    clocksource: Handle negative skews in "skew is too large" messages
    
    The nanosecond-to-millisecond skew computation uses unsigned arithmetic,
    which produces user-unfriendly large positive numbers for negative skews.
    Therefore, use signed arithmetic for this computation in order to preserve
    the negativity.
    
    Reported-by: Chris Bainbridge <chris.bainbridge@...il.com>
    Reported-by: Feng Tang <feng.tang@...el.com>
    Fixes: dd029269947a ("clocksource: Improve "skew is too large" messages")
    Signed-off-by: Paul E. McKenney <paulmck@...nel.org>

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index 91836b727cef..0600e16dbafe 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -473,8 +473,8 @@ static void clocksource_watchdog(struct timer_list *unused)
 		/* Check the deviation from the watchdog clocksource. */
 		md = cs->uncertainty_margin + watchdog->uncertainty_margin;
 		if (abs(cs_nsec - wd_nsec) > md) {
-			u64 cs_wd_msec;
-			u64 wd_msec;
+			s64 cs_wd_msec;
+			s64 wd_msec;
 			u32 wd_rem;
 
 			pr_warn("timekeeping watchdog on CPU%d: Marking clocksource '%s' as unstable because the skew is too large:\n",
@@ -483,8 +483,8 @@ static void clocksource_watchdog(struct timer_list *unused)
 				watchdog->name, wd_nsec, wdnow, wdlast, watchdog->mask);
 			pr_warn("                      '%s' cs_nsec: %lld cs_now: %llx cs_last: %llx mask: %llx\n",
 				cs->name, cs_nsec, csnow, cslast, cs->mask);
-			cs_wd_msec = div_u64_rem(cs_nsec - wd_nsec, 1000U * 1000U, &wd_rem);
-			wd_msec = div_u64_rem(wd_nsec, 1000U * 1000U, &wd_rem);
+			cs_wd_msec = div_s64_rem(cs_nsec - wd_nsec, 1000 * 1000, &wd_rem);
+			wd_msec = div_s64_rem(wd_nsec, 1000 * 1000, &wd_rem);
 			pr_warn("                      Clocksource '%s' skewed %lld ns (%lld ms) over watchdog '%s' interval of %lld ns (%lld ms)\n",
 				cs->name, cs_nsec - wd_nsec, cs_wd_msec, watchdog->name, wd_nsec, wd_msec);
 			if (curr_clocksource == cs)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ