linux-kernel - Re: [PATCH v5 1/2] sched/clock: interface to allow timestamps early in boot

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <2d5087a2-2aca-aaba-4608-112facbd3208@oracle.com>
Date:   Mon, 28 Aug 2017 10:17:09 -0400
From:   Pasha Tatashin <pasha.tatashin@...cle.com>
To:     Thomas Gleixner <tglx@...utronix.de>
Cc:     x86@...nel.org, linux-kernel@...r.kernel.org, mingo@...hat.com,
        peterz@...radead.org, hpa@...or.com, douly.fnst@...fujitsu.com
Subject: Re: [PATCH v5 1/2] sched/clock: interface to allow timestamps early
 in boot

Hi Thomas,

Thank you for your comments. My replies below.

>> +/*
>> + * Called once during to boot to initialize boot time.
>> + */
>> +void read_boot_clock64(struct timespec64 *ts)
> 
> And because its called only once, it does not need to be marked __init()
> and must be kept around forever, right?

This is because every other architecture implements read_boot_clock64() 
without __init: arm, s390. Beside, the original weak stub does not have 
__init macro. So, I can certainly try to add it for x86, but I am not 
sure what is the behavior once __init section is gone, but weak 
implementation stays.

> 
>> +{
>> +	u64 ns_boot = sched_clock_early(); /* nsec from boot */
> 
> Please do not use tail comments. They are a horrible habit.
> 
> Instead of adding this crap you'd have better spent time in adding proper
> comments explaining the reasoning behind this function,

OK, I will add introduction comment, and remove the tail comment.

> This is really broken. Look at the time keeping init code. It does:
> 
>       read_persistent_clock64(&now);
>       ...
>       read_boot_clock64(&boot);
>       ...
>       tk_set_xtime(tk, &now);
>       ...
>       set_normalized_timespec64(&tmp, -boot.tv_sec, -boot.tv_nsec);
>       tk_set_wall_to_mono(tk, tmp);
> 
> Lets assume that the initial read_persistent_clock64() happens right before
> the second. For simplicity lets assume we get 1000 seconds since the epoch.
> 
> Now read_boot_clock() reads sched_clock_early() which returns 1 second.
> 
> The second read_persistent_clock64() returns 1001 seconds since the epoch
> because the RTC advanced by now. So the resulting time stamp is going to be
> 1000s since the epoch.
> 
> In case the RTC still returns 100 since the epoch, the resulting time stamp
> is 999s since the epoch.
> 
> A full second difference. That's time stamp lottery but nothing which we
> want to base any boot time analysis on.
> 
> You have to come up with something more useful than that.
> 

This makes sense. Changing order in timekeeping_init(void) should take 
care of this:

Change to:

void __init timekeeping_init(void)
{
	/*
	 * We must determine boot timestamp before getting current  	
	 * persistent clock value, because implementation of
	 * read_boot_clock64() might also call the persistent
	 * clock, and a leap second may occur.
	 */

	read_boot_clock64(&boot);
	...
	read_persistent_clock64(&now);
	...
}