[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CALAqxLU5iKW_VS-ukULUGwaPj3ZgHUUHyyQt_fKHat__K6NO1w@mail.gmail.com>
Date: Tue, 17 Feb 2015 15:16:18 -0800
From: John Stultz <john.stultz@...aro.org>
To: Prarit Bhargava <prarit@...hat.com>
Cc: lkml <linux-kernel@...r.kernel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Miroslav Lichvar <mlichvar@...hat.com>,
Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH] time, ntp: Do not update time_state in middle of leap
second [v3]
On Thu, Feb 12, 2015 at 5:58 AM, Prarit Bhargava <prarit@...hat.com> wrote:
> During leap second insertion testing it was noticed that a small window
> exists where the time_state could be reset such that
> time_state = TIME_OK, which then causes the leap second to not occur, or
> causes the entire leap second state machine to fail with time_state =
> TIME_INS at the end of the leap second.
>
> The test did the following in userspace:
>
> tx.modes = ADJ_STATUS;
> tx.status = STA_INS;
>
> /* send leap second request */
> ret = adjtimex(&tx);
>
> /* Check adjtimex output every half second */
> now = tx.time.tv_sec;
> while (now < next_leap+2) {
> char buf[26];
> ret = adjtimex(&tx);
>
> ctime_r(&tx.time.tv_sec, buf);
> buf[strlen(buf)-1] = 0; /*remove trailing\n */
>
> printf("%s + %6ld us\t%s\n",
> buf,
> tx.time.tv_usec,
> time_state_str(ret));
> now = tx.time.tv_sec;
> /* Sleep for another half second */
> ts.tv_sec = 0;
> ts.tv_nsec = NSEC_PER_SEC/2;
> clock_nanosleep(CLOCK_MONOTONIC, 0, &ts, NULL);
> }
>
> which was intended to mimic the insertion of a leap second. A
> successful run of the test would result in the time_state transitioning
> from TIME_OK to TIME_INS, then to TIME_OOP when the leap second was
> inserted, and then to TIME_WAIT when the leap second was completed. While
> running this code failures were seen in which the time_state remained TIME_INS,
> even though the leap second had occurred.
>
> After some investigation it was noted that the test contained a small error:
> the test does not reinitialize tx.status and reissues the STA_INS every
> 1/2 second. As a result of this broken test, the following failure was noticed
> (the output below is a mix of kernel messages and the output from the test
> program, the remaining annotations are printk's in the code and my own
> additional notes):
>
> [ 942.952833] time_state [1] change from TIME_OK to TIME_INS
>
> Fri Feb 13 18:59:51 2015 + 318126 us TIME_INS
> Fri Feb 13 18:59:51 2015 + 818167 us TIME_INS
> Fri Feb 13 18:59:52 2015 + 318208 us TIME_INS
> Fri Feb 13 18:59:52 2015 + 818248 us TIME_INS
> Fri Feb 13 18:59:53 2015 + 318290 us TIME_INS
> Fri Feb 13 18:59:53 2015 + 818331 us TIME_INS
> Fri Feb 13 18:59:54 2015 + 318372 us TIME_INS
> Fri Feb 13 18:59:54 2015 + 818413 us TIME_INS
> Fri Feb 13 18:59:55 2015 + 318454 us TIME_INS
> Fri Feb 13 18:59:55 2015 + 818495 us TIME_INS
> Fri Feb 13 18:59:56 2015 + 318534 us TIME_INS
> Fri Feb 13 18:59:56 2015 + 818575 us TIME_INS
> Fri Feb 13 18:59:57 2015 + 318617 us TIME_INS
> Fri Feb 13 18:59:57 2015 + 818660 us TIME_INS
> Fri Feb 13 18:59:58 2015 + 318702 us TIME_INS
> Fri Feb 13 18:59:58 2015 + 818744 us TIME_INS
> Fri Feb 13 18:59:59 2015 + 318785 us TIME_INS
> Fri Feb 13 18:59:59 2015 + 818837 us TIME_INS
>
> [ 952.953143] time_state [4] change from TIME_INS to TIME_OOP
> [ 952.953150] Clock: inserting leap second 23:59:60 UTC
> [ 953.299905] process_adj_status: insert_leap_sec[1223] setting time_state back
> to TIME_OK [1, 1] <<< adjtimex() call every 1/2 second
> [ 953.299913] time_state [9] change from TIME_OOP to TIME_OK
>
> Fri Feb 13 18:59:59 2015 + 318878 us TIME_OK
> Fri Feb 13 18:59:59 2015 + 818931 us TIME_OK
>
> [ 954.064237] time_state [1] change from TIME_OK to TIME_INS
>
> Fri Feb 13 19:00:00 2015 + 318972 us TIME_INS
> Fri Feb 13 19:00:00 2015 + 819012 us TIME_INS
> Fri Feb 13 19:00:01 2015 + 319051 us TIME_INS
> Fri Feb 13 19:00:01 2015 + 819089 us TIME_INS
> Fri Feb 13 19:00:02 2015 + 319128 us TIME_INS
>
> As previously stated, the time_state remains TIME_INS even though the leap
> second has already occurred @ 952.953150.
>
> The test was changed to reset tx.status to 0 in the loop, and the test then
> succeeded with a 100% rate with the time state ending in TIME_WAIT.
>
> While this is highly unlikely to ever happen in the real world it is
> still something we should protect against, as breaking the state machine
> is bad.
>
> If the time_state == TIME_OOP (ie, the leap second is in progress) do not
> allow an external update to time_state in process_adj_status(). This will
> prevent external adjtimex() calls from breaking the leap second state
> machine.
>
> [v2]: Only block time_state change when TIME_OOP
> [v3]: Write a much more detailed explanation of the bug.
Ok, thanks for the more verbose explanation. Although this is more a
history of what you've seen rather then the crux of the change.
To distill this down just a bit, the point is the usual mode for NTP
time_state machine looks like:
TIME_OK -> TIME_INS -> TIME_OOP
| |
v v
TIME_DEL ------------> TIME_WAIT -(back)-> TIME_OK
(hopefully the ascii art survives here)
Now, from any of these states, currently if adjtimex is called w/ the
STA_PLL bit cleared (after STA_PLL was set), we reset back to TIME_OK,
effectively cancelling any transitions. (You'll have to imagine a line
from any of the states back to TIME_OK, since that's going to be too
ugly to do in ascii)
Your patch is trying to remove the line back from TIME_OOP back to
TIME_OK. Basically stopping the ability to reset the ntp state during
a leapsecond.
I do get that the behavior seen was strange due to a bug in the test
code which caused unexpected cancellation of state, but I'm not sure
if we should change the behavior to enforce that cancellation not be
possible. I could imagine some logic which really wants to reset the
state, which just by chance lands during a leap second, and the
application is confused since the state change didn't occur as
expected.
So I guess I'm not seeing that the state machine is actually "broken"
in this case that you've outlined. If you can articulate better why
the OOP -> OK transition is truly invalid, I'd be interested in
hearing, but I'm not sure I want to risk a behavioral change unless
there's wide agreement.
thanks
-john
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists