linux-kernel - Clock drift with GENERIC_TIME_VSYSCALL

Message-ID: <20131122163815.393ab1f2@mschwide>
Date:	Fri, 22 Nov 2013 16:38:15 +0100
From:	Martin Schwidefsky <schwidefsky@...ibm.com>
To:	John Stultz <john.stultz@...aro.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Paul Mackerras <paulus@...ba.org>,
	Tony Luck <tony.luck@...el.com>,
	Fenghua Yu <fenghua.yu@...el.com>
Cc:	linux-kernel@...r.kernel.org
Subject: Clock drift with GENERIC_TIME_VSYSCALL_OLD

Greetings,

I just hunted down a time-related bug which caused the Linux-internal
xtime to drift away from the precise TOD hardware clock found in the
s390 architecture.

After a long search I came across this lovely piece of code in
kernel/time/timekeeping.c:

#ifdef CONFIG_GENERIC_TIME_VSYSCALL_OLD
static inline void old_vsyscall_fixup(struct timekeeper *tk)
{
        s64 remainder;

        /*
        * Store only full nanoseconds into xtime_nsec after rounding
        * it up and add the remainder to the error difference.
        * XXX - This is necessary to avoid small 1ns inconsistencies caused
        * by truncating the remainder in vsyscalls. However, it causes
        * additional work to be done in timekeeping_adjust(). Once
        * the vsyscall implementations are converted to use xtime_nsec
        * (shifted nanoseconds), and CONFIG_GENERIC_TIME_VSYSCALL_OLD
        * users are removed, this can be killed.
        */
        remainder = tk->xtime_nsec & ((1ULL << tk->shift) - 1);
        tk->xtime_nsec -= remainder;
        tk->xtime_nsec += 1ULL << tk->shift;
        tk->ntp_error += remainder << tk->ntp_error_shift;

}
#else
#define old_vsyscall_fixup(tk)
#endif

The highly precise result of our TOD clock source ends up in
tk->xtime_sec / tk->xtime_nsec, where old_vsyscall_fixup just rounds
it up to the next nanosecond (booo). To add insult to injury, an
incorrect delta gets added to ntp_error: xtime has been forwarded by
((1ULL << tk->shift) - (tk->xtime_nsec & ((1ULL << tk->shift) - 1)))
and not set back by (tk->xtime_nsec & ((1ULL << tk->shift) - 1)).
As a result xtime is too fast by one nanosecond per tick. To verify
that this is indeed the problem I removed the line that adds the
nanosecond to xtime_nsec, and voila, the clocks are in sync.
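
The arithmetic can be replayed in user space. What follows is a minimal
sketch, not kernel code: struct tk_sketch, fixup_sketch() and
unbooked_drift() are hypothetical names that copy only the fields and
statements quoted above, and the sketch assumes that ntp_error ought to
shrink by exactly the amount xtime is moved forward.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct timekeeper fields used above. */
struct tk_sketch {
	uint64_t xtime_nsec;      /* nanoseconds << shift */
	int64_t  ntp_error;       /* xtime_nsec units << ntp_error_shift */
	uint32_t shift;
	uint32_t ntp_error_shift;
};

/* Same arithmetic as the old_vsyscall_fixup() quoted above. */
static void fixup_sketch(struct tk_sketch *tk)
{
	uint64_t one_ns = 1ULL << tk->shift;
	uint64_t remainder = tk->xtime_nsec & (one_ns - 1);

	/* Keep the shifted values inside a signed 64-bit range. */
	assert(tk->shift + tk->ntp_error_shift < 63);

	tk->xtime_nsec -= remainder;
	tk->xtime_nsec += one_ns;
	tk->ntp_error += (int64_t)(remainder << tk->ntp_error_shift);
}

/*
 * Run the fixup once on a copy and return how far xtime moved forward
 * without ntp_error accounting for it, in shifted nanoseconds.
 */
static int64_t unbooked_drift(struct tk_sketch tk)
{
	struct tk_sketch before = tk;
	int64_t forwarded, booked;

	fixup_sketch(&tk);
	forwarded = (int64_t)(tk.xtime_nsec - before.xtime_nsec);
	booked = (tk.ntp_error - before.ntp_error) /
		 ((int64_t)1 << tk.ntp_error_shift);

	/* Assumed correct bookkeeping: ntp_error -= forwarded. */
	return booked + forwarded;
}
```

Whatever the sub-nanosecond remainder is, unbooked_drift() returns
1 << shift, i.e. one full nanosecond per call, which matches the drift
described above.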

A possible patch to fix this would be:

--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -1347,6 +1347,7 @@ static inline void old_vsyscall_fixup(struct timekeeper *tk)
        tk->xtime_nsec -= remainder;
        tk->xtime_nsec += 1ULL << tk->shift;
        tk->ntp_error += remainder << tk->ntp_error_shift;
+       tk->ntp_error -= (1ULL << tk->shift) << tk->ntp_error_shift;

 }
 #else

But that has the downside that it creates a negative ntp_error, which
can only be corrected with an adjustment of tk->mult, and that takes a
long time.
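
To illustrate the downside, here is a small user-space sketch of the
patched arithmetic. The names struct tk_sketch, patched_fixup_sketch()
and patched_ntp_delta() are hypothetical and only mirror the statements
in the diff above; the function reports the net change of ntp_error for
one fixup, in shifted nanoseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct timekeeper fields used above. */
struct tk_sketch {
	uint64_t xtime_nsec;      /* nanoseconds << shift */
	int64_t  ntp_error;       /* xtime_nsec units << ntp_error_shift */
	uint32_t shift;
	uint32_t ntp_error_shift;
};

/* old_vsyscall_fixup() arithmetic with the extra line from the patch. */
static void patched_fixup_sketch(struct tk_sketch *tk)
{
	uint64_t one_ns = 1ULL << tk->shift;
	uint64_t remainder = tk->xtime_nsec & (one_ns - 1);

	/* Keep the shifted values inside a signed 64-bit range. */
	assert(tk->shift + tk->ntp_error_shift < 63);

	tk->xtime_nsec -= remainder;
	tk->xtime_nsec += one_ns;
	tk->ntp_error += (int64_t)(remainder << tk->ntp_error_shift);
	tk->ntp_error -= (int64_t)(one_ns << tk->ntp_error_shift);
}

/* Net ntp_error change from one patched fixup, in shifted nanoseconds. */
static int64_t patched_ntp_delta(struct tk_sketch tk)
{
	int64_t before = tk.ntp_error;

	patched_fixup_sketch(&tk);
	/* The delta is an exact multiple of the divisor, so no rounding. */
	return (tk.ntp_error - before) / ((int64_t)1 << tk.ntp_error_shift);
}
```

The result is remainder - (1 << shift), which is negative for every
possible remainder, so each call leaves a deficit in ntp_error.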

The fix I am going to use is to convert s390 to GENERIC_TIME_VSYSCALL;
you might want to think about doing that for powerpc and ia64 as well.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
