lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <49fe8cc5-0f0f-6cac-7a5c-803e81f5667d@runbox.com>
Date:   Thu, 10 Nov 2016 06:57:00 +0300
From:   "M. Vefa Bicakci" <m.v.b@...box.com>
To:     Thomas Gleixner <tglx@...utronix.de>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     "Charles (Chas) Williams" <ciwillia@...cade.com>,
        "x86@...nel.org" <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH] x86/cpuid: Deal with broken firmware once more

On 11/09/2016 06:35 PM, Thomas Gleixner wrote:
> Both ACPI and MP specifications require that the APIC id in the respective
> tables must be the same as the APIC id in CPUID. 
> 
> The kernel retrieves the physical package id from the APIC id during the
> ACPI/MP table scan and builds the physical to logical package map.
> 
> There exist Virtualbox and Xen implementations which violate the spec. As a
> result the physical to logical package map, which relies on the ACPI/MP
> tables does not work on those systems, because the CPUID initialized
> physical package id does not match the firmware id. This causes system
> crashes and malfunction due to invalid package mappings.
> 
> The only way to cure this is to sanitize the physical package id after the
> CPUID enumeration and yell when the APIC ids are different. If the physical
> package IDs differ use the package information from the ACPI/MP tables so
> the existing logical package map just works.
> 
> Reported-by: "Charles (Chas) Williams" <ciwillia@...cade.com>,
> Reported-by: M. Vefa Bicakci <m.v.b@...box.com>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>

Hello Thomas and Sebastian,

Sorry for the delay in reporting what I have found out and for the delay in testing this patch, which has been due to my health.

I have found that your patch unfortunately does not improve the situation for me. Here is an excerpt obtained from the dmesg of a kernel compiled with this patch *as well as* Sebastian's patch:

=== 8< ===
[    0.002561] CPU: Physical Processor ID: 0

[    0.002566] CPU: Processor Core ID: 0

[    0.002572] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: ffff CPUID: 2

[    0.002577] [Firmware Bug]: CPU0: Using firmware package id 4095 instead of 0

[    0.002586] Last level iTLB entries: 4KB 512, 2MB 8, 4MB 8

[    0.002591] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0

[    0.033319] ftrace: allocating 28753 entries in 113 pages

[    0.040121] cpu 0 spinlock event irq 1

[    0.040145] smpboot: Max logical packages: 1

[    0.040155] VPMU disabled by hypervisor.

[    0.040181] Performance Events: unsupported p6 CPU model 42 no PMU driver, software events only.

[    0.047050] NMI watchdog: disabled (cpu0): hardware events not enabled

[    0.047065] NMI watchdog: Shutting down hard lockup detector on all cpus

[    0.052015] installing Xen timer for CPU 1

[    0.052074] SMP alternatives: switching to SMP code

[    0.002000] Disabled fast string operations

[    0.002000] [Firmware Bug]: CPU1: APIC id mismatch. Firmware: ffff CPUID: 2

[    0.002000] [Firmware Bug]: CPU1: Using firmware package id 4095 instead of 0

[    0.002000] smpboot: APIC(ffff) Converting physical 4095 to logical package 0

[    0.078061] cpu 1 spinlock event irq 13

...
[    0.216404] Freeing initrd memory: 4340K (ffff880001fa7000 - ffff8800023e4000)

[    0.216487] RAPL PMU: rapl pmu error: max package: 1 but CPU0 belongs to 65535

[    0.217572] futex hash table entries: 512 (order: 3, 32768 bytes)

[    0.218293] Initialise system trusted keyrings

...
[    0.216404] Freeing initrd memory: 4340K (ffff880001fa7000 - ffff8800023e4000)

[    0.216487] RAPL PMU: rapl pmu error: max package: 1 but CPU0 belongs to 65535

[    0.217572] futex hash table entries: 512 (order: 3, 32768 bytes)

...
[    2.974474] intel_rapl: Found RAPL domain package

[    2.974489] intel_rapl: Found RAPL domain core

[    2.974498] intel_rapl: Found RAPL domain uncore

[    2.974518] intel_rapl: RAPL package 4095 domain package locked by BIOS

=== >8 ===

As you can see, your patch unfortunately does not correct the issue with the virtual CPU package identifiers in Xen-based virtual machines using para-virtualization.

In summary, the root cause of the issue for me appears to be that the boot-up code in the init_apic_mappings function switches the APIC 'ops' structure from Xen's 'ops' structure to the no-op ops structure. Due to this, the smp_init_package_map uses the no-op APIC ops structure's cpu_present_to_to_apicid function, even though it should use the corresponding method from Xen's APIC ops structure (i.e., xen_cpu_present_to_apicid).

Here is a dmesg excerpt with a kernel patched with Sebastian's patch (and some debugging code), exhibiting the issue I have just explained: (Note the 'switched to apic NOOP' line.)

=== 8< ===
(early) [    0.000000] SFI: Simple Firmware Interface v0.81 http://simplefirmware.org
(early) [    0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
(early) [    0.000000] No local APIC present
(early) [    0.000000] APIC: disable apic facility
(early) [    0.000000] APIC: switched to apic NOOP
(early) [    0.000000] e820: [mem 0xfa000000-0xffffffff] available for PCI devices
(early) [    0.000000] Booting paravirtualized kernel on Xen
...
[    0.034082] mvb: kernel_init_freeable:1007: About to call smp_prepare_cpus...
[    0.034123] cpu 0 spinlock event irq 1
[    0.034138] smpboot: mvb: smp_init_package_map:372: max_physical_pkg_id, after set: 4096
[    0.034146] mvb: __default_cpu_present_to_apicid:612: Returning 65535! mps_cpu: 0, nr_cpu_ids: 2, cpu_present(mps_cpu): 1
[    0.034155] smpboot: mvb: smp_init_package_map:379: apicid: 65535, apic_id_valid(apicid):0
[    0.034162] smpboot: Max logical packages: 1
[    0.034169] VPMU disabled by hypervisor.
[    0.034187] Performance Events: unsupported p6 CPU model 42 no PMU driver, software events only.
[    0.041131] NMI watchdog: disabled (cpu0): hardware events not enabled
[    0.041142] NMI watchdog: Shutting down hard lockup detector on all cpus
[    0.046024] installing Xen timer for CPU 1
[    0.046121] SMP alternatives: switching to SMP code
[    0.002000] Disabled fast string operations
[    0.002000] mvb: detect_ht: CPU has hyper-threading capability
[    0.002000] mvb: CPU: Physical Processor ID: 0
[    0.002000] mvb: CPU: Processor Core ID: 0
[    0.002000] mvb: identify_cpu:1112: c: ffff880013b0a040, c->logical_proc_id: 65535
[    0.002000] mvb: __default_cpu_present_to_apicid:612: Returning 65535! mps_cpu: 1, nr_cpu_ids: 2, cpu_present(mps_cpu): 1
[    0.002000] smpboot: mvb: topology_update_package_map:270: cpu: 1, pkg: 4095
[    0.002000] smpboot: APIC(ffff) Converting physical 4095 to logical package 0
[    0.002000] smpboot: mvb: topology_update_package_map:305: cpu: 1, cpu_data(cpu).logical_proc_id: 0
...
[    0.266540] RAPL PMU: mvb: init_rapl_pmus:686: rapl pmu: max package: 1
[    0.266547] RAPL PMU: mvb: rapl pmu: CPU0 (ffff880013a0a040) belongs to package ID 65535.
[    0.266559] RAPL PMU: mvb: rapl pmu: Package ID >= max package for CPU0 (ffff880013a0a040). This is an error.
[    0.266569] RAPL PMU: mvb: rapl pmu: CPU1 (ffff880013b0a040) belongs to package ID 0.
=== >8 ===

Through some debugging, last week I came up with the patch at the end of this e-mail, which does correct the issue for me. Once again, I am sorry for reporting this too late.

With a kernel built with the patch below, Sebastian's patch and some debugging code, I obtain the following dmesg output, which appears to be correct and does not emit RAPL-related errors during boot-up:

=== 8< ===
[    0.032083] mvb: kernel_init_freeable:1007: About to call smp_prepare_cpus...

[    0.032106] cpu 0 spinlock event irq 1

[    0.032119] smpboot: mvb: smp_init_package_map:372: max_physical_pkg_id, after set: 4096

[    0.032127] mvb: xen_cpu_present_to_apicid:151: cpu: 0, ret: 0

[    0.032132] smpboot: mvb: topology_update_package_map:270: cpu: 0, pkg: 0

[    0.032137] smpboot: APIC(0) Converting physical 0 to logical package 0

[    0.032142] smpboot: mvb: topology_update_package_map:305: cpu: 0, cpu_data(cpu).logical_proc_id: 0

[    0.032149] smpboot: mvb: smp_init_package_map:385: topology_update_package_map successful.

[    0.032155] smpboot: Max logical packages: 1

[    0.032161] VPMU disabled by hypervisor.

[    0.032177] Performance Events: unsupported p6 CPU model 42 no PMU driver, software events only.

[    0.039061] NMI watchdog: disabled (cpu0): hardware events not enabled

[    0.039102] NMI watchdog: Shutting down hard lockup detector on all cpus

[    0.045048] installing Xen timer for CPU 1

[    0.045145] SMP alternatives: switching to SMP code

[    0.002000] Disabled fast string operations

[    0.002000] mvb: detect_ht: CPU has hyper-threading capability

[    0.002000] mvb: CPU: Physical Processor ID: 0

[    0.002000] mvb: CPU: Processor Core ID: 1

[    0.002000] mvb: identify_cpu:1112: c: ffff88001790a040, c->logical_proc_id: 0

[    0.002000] mvb: xen_cpu_present_to_apicid:151: cpu: 1, ret: 0

[    0.002000] smpboot: mvb: topology_update_package_map:270: cpu: 1, pkg: 0

[    0.002000] smpboot: mvb: topology_update_package_map:305: cpu: 1, cpu_data(cpu).logical_proc_id: 0

[    0.067149] cpu 1 spinlock event irq 13

...
[    0.208395] RAPL PMU: mvb: init_rapl_pmus:686: rapl pmu: max package: 1

[    0.208400] RAPL PMU: mvb: rapl pmu: CPU0 (ffff88001780a040) belongs to package ID 0.

[    0.208409] RAPL PMU: mvb: rapl pmu: CPU1 (ffff88001790a040) belongs to package ID 0.

[    0.208491] RAPL PMU: API unit is 2^-32 Joules, 3 fixed counters, 163840 ms ovfl timer

[    0.208507] RAPL PMU: hw unit of domain pp0-core 2^-16 Joules

[    0.208512] RAPL PMU: hw unit of domain package 2^-16 Joules

[    0.208517] RAPL PMU: hw unit of domain pp1-gpu 2^-16 Joules

[    0.209083] futex hash table entries: 512 (order: 3, 32768 bytes)

=== >8 ===

My patch may not be a very elegant correction for the issue at hand, unfortunately (and it probably does not comply with the kernel patch submission guidelines).

In addition, I think Sebastian's patch should be included in the mainline kernel, so that potential bugs do not bring down the kernel with a RAPL-related oops at boot-up time with virtualization.

Please note that I will need to be offline for most of the day today (my current timezone is UTC+03:00), and as a result my responses to your replies will most likely be a bit late.

Thank you,

Vefa


>From b7097820285ab6a8588879969d74c56d890d4fd4 Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci" <m.v.b@...box.com>
Date: Fri, 4 Nov 2016 10:01:19 +0300
Subject: [PATCH] Do not reset apic to apic_noop with Xen

---
 arch/x86/include/asm/apic.h | 3 +++
 arch/x86/kernel/apic/apic.c | 7 +++++--
 arch/x86/xen/apic.c         | 4 +++-
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index b3d4c042e610..8c37580b7eb7 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -359,6 +359,9 @@ struct apic {
  */
 extern struct apic *apic;
 
+/* Indicates whether a hypervisor has already set 'apic'. */
+extern int apic_set_by_hypervisor;
+
 /*
  * APIC drivers are probed based on how they are listed in the .apicdrivers
  * section. So the order is important and enforced by the ordering
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 076c315cdf18..a3a1d4570acf 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -172,6 +172,8 @@ unsigned long mp_lapic_addr;
 int disable_apic;
 /* Disable local APIC timer from the kernel commandline or via dmi quirk */
 static int disable_apic_timer __initdata;
+/* Indicates whether a hypervisor has already set 'apic'. */
+int apic_set_by_hypervisor;
 /* Local APIC timer works in C2 */
 int local_apic_timer_c2_ok;
 EXPORT_SYMBOL_GPL(local_apic_timer_c2_ok);
@@ -1788,8 +1790,9 @@ void __init init_apic_mappings(void)
 		return;
 	}
 
-	/* If no local APIC can be found return early */
-	if (!smp_found_config && detect_init_APIC()) {
+	if (apic_set_by_hypervisor) {
+		/* Hypervisor has already taken care of the APIC. */
+	} else if (!smp_found_config && detect_init_APIC()) {
 		/* lets NOP'ify apic operations */
 		pr_info("APIC: disable apic facility\n");
 		apic_disable();
diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c
index 5ac792a1d419..b79a003c13a5 100644
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -229,8 +229,10 @@ void __init xen_init_apic(void)
 	x86_io_apic_ops.read = xen_io_apic_read;
 	/* On PV guests the APIC CPUID bit is disabled so none of the
 	 * routines end up executing. */
-	if (!xen_initial_domain())
+	if (!xen_initial_domain()) {
 		apic = &xen_pv_apic;
+		apic_set_by_hypervisor = 1;
+	}
 
 	x86_platform.apic_post_init = xen_apic_check;
 }
-- 
2.5.5

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ