[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2d5087a2-2aca-aaba-4608-112facbd3208@oracle.com>
Date: Mon, 28 Aug 2017 10:17:09 -0400
From: Pasha Tatashin <pasha.tatashin@...cle.com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, mingo@...hat.com,
peterz@...radead.org, hpa@...or.com, douly.fnst@...fujitsu.com
Subject: Re: [PATCH v5 1/2] sched/clock: interface to allow timestamps early
in boot
Hi Thomas,
Thank you for your comments. My replies below.
>> +/*
>> + * Called once during to boot to initialize boot time.
>> + */
>> +void read_boot_clock64(struct timespec64 *ts)
>
> And because its called only once, it does not need to be marked __init()
> and must be kept around forever, right?
This is because every other architecture implements read_boot_clock64()
without __init: arm, s390. Beside, the original weak stub does not have
__init macro. So, I can certainly try to add it for x86, but I am not
sure what is the behavior once __init section is gone, but weak
implementation stays.
>
>> +{
>> + u64 ns_boot = sched_clock_early(); /* nsec from boot */
>
> Please do not use tail comments. They are a horrible habit.
>
> Instead of adding this crap you'd have better spent time in adding proper
> comments explaining the reasoning behind this function,
OK, I will add introduction comment, and remove the tail comment.
> This is really broken. Look at the time keeping init code. It does:
>
> read_persistent_clock64(&now);
> ...
> read_boot_clock64(&boot);
> ...
> tk_set_xtime(tk, &now);
> ...
> set_normalized_timespec64(&tmp, -boot.tv_sec, -boot.tv_nsec);
> tk_set_wall_to_mono(tk, tmp);
>
> Lets assume that the initial read_persistent_clock64() happens right before
> the second. For simplicity lets assume we get 1000 seconds since the epoch.
>
> Now read_boot_clock() reads sched_clock_early() which returns 1 second.
>
> The second read_persistent_clock64() returns 1001 seconds since the epoch
> because the RTC advanced by now. So the resulting time stamp is going to be
> 1000s since the epoch.
>
> In case the RTC still returns 100 since the epoch, the resulting time stamp
> is 999s since the epoch.
>
> A full second difference. That's time stamp lottery but nothing which we
> want to base any boot time analysis on.
>
> You have to come up with something more useful than that.
>
This makes sense. Changing order in timekeeping_init(void) should take
care of this:
Change to:
void __init timekeeping_init(void)
{
/*
* We must determine boot timestamp before getting current
* persistent clock value, because implementation of
* read_boot_clock64() might also call the persistent
* clock, and a leap second may occur.
*/
read_boot_clock64(&boot);
...
read_persistent_clock64(&now);
...
}
Powered by blists - more mailing lists