lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <1423749499-18520-1-git-send-email-prarit@redhat.com>
Date:	Thu, 12 Feb 2015 08:58:19 -0500
From:	Prarit Bhargava <prarit@...hat.com>
To:	linux-kernel@...r.kernel.org
Cc:	Prarit Bhargava <prarit@...hat.com>,
	John Stultz <john.stultz@...aro.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Miroslav Lichvar <mlichvar@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>
Subject: [PATCH] time, ntp: Do not update time_state in middle of leap second [v3]

During leap second insertion testing it was noticed that a small window
exists where the time_state could be reset such that
time_state = TIME_OK, which then causes the leap second to not occur, or
causes the entire leap second state machine to fail with time_state =
TIME_INS at the end of the leap second.

The test did the following in userspace:

        tx.modes = ADJ_STATUS;
        tx.status = STA_INS;

	/* send leap second request */
        ret = adjtimex(&tx);

        /* Check adjtimex output every half second */
        now = tx.time.tv_sec;
        while (now < next_leap+2) {
                char buf[26];
                ret = adjtimex(&tx);

                ctime_r(&tx.time.tv_sec, buf);
                buf[strlen(buf)-1] = 0; /*remove trailing\n */

                printf("%s + %6ld us\t%s\n",
                                buf,
                                tx.time.tv_usec,
                                time_state_str(ret));
                now = tx.time.tv_sec;
                /* Sleep for another half second */
                ts.tv_sec = 0;
                ts.tv_nsec = NSEC_PER_SEC/2;
                clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, NULL);
        }

which was intended to mimic the insertion of a leap second.  A
successful run of the test would result in the time_state transitioning
from TIME_OK to TIME_INS, then to TIME_OOP when the leap second was
inserted, and then to TIME_WAIT when the leap second was completed.  While
running this code failures were seen in which the time_state remained TIME_INS,
even though the leap second had occurred.

After some investigation it was noted that the test contained a small error:
the test does not reinitialize tx.status and reissues the STA_INS every
1/2 second.  As a result of this broken test, the following failure was noticed
(the output below is a mix of kernel messages and the output from the test
program, the remaining annotations are printk's in the code and my own
additional notes):

[  942.952833] time_state [1] change from TIME_OK to TIME_INS

Fri Feb 13 18:59:51 2015 + 318126 us    TIME_INS
Fri Feb 13 18:59:51 2015 + 818167 us    TIME_INS
Fri Feb 13 18:59:52 2015 + 318208 us    TIME_INS
Fri Feb 13 18:59:52 2015 + 818248 us    TIME_INS
Fri Feb 13 18:59:53 2015 + 318290 us    TIME_INS
Fri Feb 13 18:59:53 2015 + 818331 us    TIME_INS
Fri Feb 13 18:59:54 2015 + 318372 us    TIME_INS
Fri Feb 13 18:59:54 2015 + 818413 us    TIME_INS
Fri Feb 13 18:59:55 2015 + 318454 us    TIME_INS
Fri Feb 13 18:59:55 2015 + 818495 us    TIME_INS
Fri Feb 13 18:59:56 2015 + 318534 us    TIME_INS
Fri Feb 13 18:59:56 2015 + 818575 us    TIME_INS
Fri Feb 13 18:59:57 2015 + 318617 us    TIME_INS
Fri Feb 13 18:59:57 2015 + 818660 us    TIME_INS
Fri Feb 13 18:59:58 2015 + 318702 us    TIME_INS
Fri Feb 13 18:59:58 2015 + 818744 us    TIME_INS
Fri Feb 13 18:59:59 2015 + 318785 us    TIME_INS
Fri Feb 13 18:59:59 2015 + 818837 us    TIME_INS

[  952.953143] time_state [4] change from TIME_INS to TIME_OOP
[  952.953150] Clock: inserting leap second 23:59:60 UTC
[  953.299905] process_adj_status: insert_leap_sec[1223] setting time_state back
to TIME_OK [1, 1]   <<< adjtimex() call every 1/2 second
[  953.299913] time_state [9] change from TIME_OOP to TIME_OK

Fri Feb 13 18:59:59 2015 + 318878 us    TIME_OK
Fri Feb 13 18:59:59 2015 + 818931 us    TIME_OK

[  954.064237] time_state [1] change from TIME_OK to TIME_INS

Fri Feb 13 19:00:00 2015 + 318972 us    TIME_INS
Fri Feb 13 19:00:00 2015 + 819012 us    TIME_INS
Fri Feb 13 19:00:01 2015 + 319051 us    TIME_INS
Fri Feb 13 19:00:01 2015 + 819089 us    TIME_INS
Fri Feb 13 19:00:02 2015 + 319128 us    TIME_INS

As previously stated, the time_state remains TIME_INS even though the leap
second has already occurred @ 952.953150.

The test was changed to reset tx.status to 0 in the loop, and the test then
succeeded with a 100% rate with the time state ending in TIME_WAIT.

While this is highly unlikely to ever happen in the real world it is
still something we should protect against, as breaking the state machine
is bad.

If the time_state == TIME_OOP (ie, the leap second is in progress) do not
allow an external update to time_state in process_adj_status().  This will
prevent external adjtimex() calls from breaking the leap second state
machine.

[v2]: Only block time_state change when TIME_OOP
[v3]: Write a much more detailed explanation of the bug.

Signed-off-by: Prarit Bhargava <prarit@...hat.com>
Cc: John Stultz <john.stultz@...aro.org>
Cc: Thomas Gleixner <tglx@...utronix.de>
Cc: Miroslav Lichvar <mlichvar@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>
---
 kernel/time/ntp.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index 28bf91c..6ff5cd5 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -535,7 +535,8 @@ void ntp_notify_cmos_timer(void) { }
 static inline void process_adj_status(struct timex *txc, struct timespec64 *ts)
 {
 	if ((time_status & STA_PLL) && !(txc->status & STA_PLL)) {
-		time_state = TIME_OK;
+		if (time_state != TIME_OOP)
+			time_state = TIME_OK;
 		time_status = STA_UNSYNC;
 		/* restart PPS frequency calibration */
 		pps_reset_freq_interval();
-- 
1.7.9.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ