lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.20.1612090953570.3794@nanos>
Date:   Fri, 9 Dec 2016 10:59:51 +0100 (CET)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Roland Scheidegger <rscheidegger_lists@...peed.ch>
cc:     sroland@...are.com, Ingo Molnar <mingo@...hat.com>,
        "H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86/tsc: RFC: re-synchronize TSCs to boot cpu TSC

On Fri, 9 Dec 2016, Roland Scheidegger wrote:
> 
> I saw some system lockups though:
> When doing a cold boot, this kernel never managed to boot up. The last
> message seen is:
> x86: Booting SMP configuration:
> .... node  #0, CPUs:        #1

Weird. That really would be interesting to figure out what goes wrong
there. What bothers me is that we don't see something like this:

> [    0.172334] TSC ADJUST differs: Reference CPU0: -577421768610 CPU1:
> -577423766270

Can you please apply the debug patch below and provide the output ?


> [    0.094492] x86: Booting SMP configuration:
> [    0.094534] .... node  #0, CPUs:        #1
> [    0.172334] TSC ADJUST differs: Reference CPU0: -577421768610 CPU1:
> -577423766270

What on earth is this BIOS doing? That's a couple of minutes back in time.

And the difference between the max and min adjust value is 2050932 cycles.

> Without the patches on cold boot it just was as expected:
> [    0.093700] x86: Booting SMP configuration:
> [    0.093737] .... node  #0, CPUs:        #1
> [    0.174304] TSC synchronization [CPU#0 -> CPU#1]:
> [    0.174375] Measured 1837188 cycles TSC warp between CPUs, turning
> off TSC clock.

Not surprising given the above numbers.

Thanks,

	tglx

8<-------------------------
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -95,6 +95,8 @@ bool tsc_store_and_check_tsc_adjust(void
 	cur->nextcheck = jiffies + HZ;
 	cur->warned = false;
 
+	pr_info("TSC ADJUST: CPU%u: %lld\n", cpu,  bootval);
+
 	/*
 	 * Check whether this CPU is the first in a package to come up. In
 	 * this case do not check the boot value against another package
@@ -282,17 +284,22 @@ void check_tsc_sync_source(int cpu)
 	else
 		atomic_set(&test_runs, 3);
 retry:
+	pr_info("TSC source sync %d -> %d runs %d\n", smp_processor_id(),
+		cpu, atomic_read(&test_runs));
 	/*
 	 * Wait for the target to start or to skip the test:
 	 */
 	while (atomic_read(&start_count) != cpus - 1) {
 		if (atomic_read(&skip_test) > 0) {
+			pr_info("TSC source sync skipped\n");
 			atomic_set(&skip_test, 0);
 			return;
 		}
 		cpu_relax();
 	}
 
+	pr_info("TSC source sync starting\n");
+
 	/*
 	 * Trigger the target to continue into the measurement too:
 	 */
@@ -300,6 +307,7 @@ void check_tsc_sync_source(int cpu)
 
 	check_tsc_warp(loop_timeout(cpu));
 
+	pr_info("TSC source sync wait for stop\n");
 	while (atomic_read(&stop_count) != cpus-1)
 		cpu_relax();
 
@@ -311,7 +319,7 @@ void check_tsc_sync_source(int cpu)
 	if (!nr_warps) {
 		atomic_set(&test_runs, 0);
 
-		pr_debug("TSC synchronization [CPU#%d -> CPU#%d]: passed\n",
+		pr_info("TSC synchronization [CPU#%d -> CPU#%d]: passed\n",
 			smp_processor_id(), cpu);
 
 	} else if (atomic_dec_and_test(&test_runs) || random_warps) {
@@ -367,6 +375,7 @@ void check_tsc_sync_target(void)
 	 * successful skip the test.
 	 */
 	if (tsc_store_and_check_tsc_adjust()) {
+		pr_info("TSC target sync skip\n");
 		atomic_inc(&skip_test);
 		return;
 	}
@@ -376,6 +385,7 @@ void check_tsc_sync_target(void)
 	 * Register this CPU's participation and wait for the
 	 * source CPU to start the measurement:
 	 */
+	pr_info("TSC target sync starting\n");
 	atomic_inc(&start_count);
 	while (atomic_read(&start_count) != cpus)
 		cpu_relax();
@@ -392,6 +402,7 @@ void check_tsc_sync_target(void)
 	 */
 	atomic_inc(&stop_count);
 
+	pr_info("TSC target sync stop\n");
 	/*
 	 * Wait for the source CPU to print stuff:
 	 */
@@ -403,6 +414,8 @@ void check_tsc_sync_target(void)
 	 */
 	atomic_set(&stop_count, 0);
 
+	pr_info("TSC target sync remaining runs %d\n", atomic_read(&test_runs));
+
 	/*
 	 * Check the number of remaining test runs. If not zero, the test
 	 * failed and a retry with adjusted TSC is possible. If zero the

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ