lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 6 Nov 2018 06:35:36 -0500
From:   Steven Sistare <steven.sistare@...cle.com>
To:     Dominique Martinet <asmadeus@...ewreck.org>,
        Pavel Tatashin <pavel.tatashin@...rosoft.com>
Cc:     daniel.m.jordan@...cle.com, linux@...linux.org.uk,
        schwidefsky@...ibm.com, heiko.carstens@...ibm.com,
        john.stultz@...aro.org, sboyd@...eaurora.org, x86@...nel.org,
        linux-kernel@...r.kernel.org, mingo@...hat.com, tglx@...utronix.de,
        hpa@...or.com, douly.fnst@...fujitsu.com, peterz@...radead.org,
        prarit@...hat.com, feng.tang@...el.com, pmladek@...e.com,
        gnomes@...rguk.ukuu.org.uk, linux-s390@...r.kernel.org,
        boris.ostrovsky@...cle.com, jgross@...e.com, pbonzini@...hat.com,
        virtualization@...ts.linux-foundation.org, kvm@...r.kernel.org,
        qemu-devel@...gnu.org
Subject: Re: [PATCH v15 23/26] sched: early boot clock

Pavel has a new email address, cc'd - steve

On 11/6/2018 12:42 AM, Dominique Martinet wrote:
> (added various kvm/virtualization lists in Cc as well as qemu as I don't
> know who's "wrong" here)
> 
> Pavel Tatashin wrote on Thu, Jul 19, 2018:
>> Allow sched_clock() to be used before schec_clock_init() is called.
>> This provides with a way to get early boot timestamps on machines with
>> unstable clocks.
> 
> This isn't something I understand, but bisect tells me this patch
> (landed as 857baa87b64 ("sched/clock: Enable sched clock early")) makes
> a VM running with kvmclock take a step in uptime/printk timer early in
> boot sequence as illustrated below. The step seems to be related to the
> amount of time the host was suspended while qemu was running before the
> reboot.
> 
> $ dmesg
> ...
> [    0.000000] SMBIOS 2.8 present.
> [    0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
> [    0.000000] Hypervisor detected: KVM
> [    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> [283120.529821] kvm-clock: cpu 0, msr 321a8001, primary cpu clock
> [283120.529822] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [283120.529824] tsc: Detected 2592.000 MHz processor
> ...
> 
> (The VM is x86_64 on x86_64, I can provide my .config on request but
> don't think it's related)
> 
> 
> It's rather annoying for me as I often reboot VMs and rely on the
> 'uptime' command to check if I did just reboot or not as I have the
> attention span of a goldfish; I'd rather not have to find something else
> to check if I did just reboot or not.
> 
> Note that if the qemu process is restarted, there is no offset anymore.
> 
> I unfortunately just did that so cannot say with confidence (putting my
> laptop to sleep for 30s only led to a 2s offset and I do not want to
> wait longer right now), but it looks like the clock is still mostly
> correct after reboot after disabling my VM's ntp client. Will infirm
> that tomorrow if I was wrong.
> 
> 
> Happy to try to help fixing this in any way, as written above the quote
> I'm not even actually sure who is wrong here.
> 
> Thanks!
> 
> 
> 
> (As a side, mostly unrelated note, insert swearing here about cf7a63ef4
> not compiling earlier in this serie; some variable declaration got
> removed before their use. Was fixed in the next patch but I didn't
> notice the kernel didn't fully rebuild and wasted time in my bisect
> heading the wrong way...)
> 
>> Signed-off-by: Pavel Tatashin <pasha.tatashin@...cle.com>
>> ---
>>  init/main.c          |  2 +-
>>  kernel/sched/clock.c | 20 +++++++++++++++++++-
>>  2 files changed, 20 insertions(+), 2 deletions(-)
>>
>> diff --git a/init/main.c b/init/main.c
>> index 162d931c9511..ff0a24170b95 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -642,7 +642,6 @@ asmlinkage __visible void __init start_kernel(void)
>>  	softirq_init();
>>  	timekeeping_init();
>>  	time_init();
>> -	sched_clock_init();
>>  	printk_safe_init();
>>  	perf_event_init();
>>  	profile_init();
>> @@ -697,6 +696,7 @@ asmlinkage __visible void __init start_kernel(void)
>>  	acpi_early_init();
>>  	if (late_time_init)
>>  		late_time_init();
>> +	sched_clock_init();
>>  	calibrate_delay();
>>  	pid_idr_init();
>>  	anon_vma_init();
>> diff --git a/kernel/sched/clock.c b/kernel/sched/clock.c
>> index 0e9dbb2d9aea..422cd63f8f17 100644
>> --- a/kernel/sched/clock.c
>> +++ b/kernel/sched/clock.c
>> @@ -202,7 +202,25 @@ static void __sched_clock_gtod_offset(void)
>>  
>>  void __init sched_clock_init(void)
>>  {
>> +	unsigned long flags;
>> +
>> +	/*
>> +	 * Set __gtod_offset such that once we mark sched_clock_running,
>> +	 * sched_clock_tick() continues where sched_clock() left off.
>> +	 *
>> +	 * Even if TSC is buggered, we're still UP at this point so it
>> +	 * can't really be out of sync.
>> +	 */
>> +	local_irq_save(flags);
>> +	__sched_clock_gtod_offset();
>> +	local_irq_restore(flags);
>> +
>>  	sched_clock_running = 1;
>> +
>> +	/* Now that sched_clock_running is set adjust scd */
>> +	local_irq_save(flags);
>> +	sched_clock_tick();
>> +	local_irq_restore(flags);
>>  }
>>  /*
>>   * We run this as late_initcall() such that it runs after all built-in drivers,
>> @@ -356,7 +374,7 @@ u64 sched_clock_cpu(int cpu)
>>  		return sched_clock() + __sched_clock_offset;
>>  
>>  	if (unlikely(!sched_clock_running))
>> -		return 0ull;
>> +		return sched_clock();
>>  
>>  	preempt_disable_notrace();
>>  	scd = cpu_sdc(cpu);

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ