[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <49fe8cc5-0f0f-6cac-7a5c-803e81f5667d@runbox.com>
Date: Thu, 10 Nov 2016 06:57:00 +0300
From: "M. Vefa Bicakci" <m.v.b@...box.com>
To: Thomas Gleixner <tglx@...utronix.de>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: "Charles (Chas) Williams" <ciwillia@...cade.com>,
"x86@...nel.org" <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH] x86/cpuid: Deal with broken firmware once more
On 11/09/2016 06:35 PM, Thomas Gleixner wrote:
> Both ACPI and MP specifications require that the APIC id in the respective
> tables must be the same as the APIC id in CPUID.
>
> The kernel retrieves the physical package id from the APIC id during the
> ACPI/MP table scan and builds the physical to logical package map.
>
> There exist Virtualbox and Xen implementations which violate the spec. As a
> result the physical to logical package map, which relies on the ACPI/MP
> tables does not work on those systems, because the CPUID initialized
> physical package id does not match the firmware id. This causes system
> crashes and malfunction due to invalid package mappings.
>
> The only way to cure this is to sanitize the physical package id after the
> CPUID enumeration and yell when the APIC ids are different. If the physical
> package IDs differ use the package information from the ACPI/MP tables so
> the existing logical package map just works.
>
> Reported-by: "Charles (Chas) Williams" <ciwillia@...cade.com>,
> Reported-by: M. Vefa Bicakci <m.v.b@...box.com>
> Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
Hello Thomas and Sebastian,
Sorry for the delay in reporting what I have found out and for the delay in testing this patch, which has been due to my health.
I have found that your patch unfortunately does not improve the situation for me. Here is an excerpt obtained from the dmesg of a kernel compiled with this patch *as well as* Sebastian's patch:
=== 8< ===
[ 0.002561] CPU: Physical Processor ID: 0
[ 0.002566] CPU: Processor Core ID: 0
[ 0.002572] [Firmware Bug]: CPU0: APIC id mismatch. Firmware: ffff CPUID: 2
[ 0.002577] [Firmware Bug]: CPU0: Using firmware package id 4095 instead of 0
[ 0.002586] Last level iTLB entries: 4KB 512, 2MB 8, 4MB 8
[ 0.002591] Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0
[ 0.033319] ftrace: allocating 28753 entries in 113 pages
[ 0.040121] cpu 0 spinlock event irq 1
[ 0.040145] smpboot: Max logical packages: 1
[ 0.040155] VPMU disabled by hypervisor.
[ 0.040181] Performance Events: unsupported p6 CPU model 42 no PMU driver, software events only.
[ 0.047050] NMI watchdog: disabled (cpu0): hardware events not enabled
[ 0.047065] NMI watchdog: Shutting down hard lockup detector on all cpus
[ 0.052015] installing Xen timer for CPU 1
[ 0.052074] SMP alternatives: switching to SMP code
[ 0.002000] Disabled fast string operations
[ 0.002000] [Firmware Bug]: CPU1: APIC id mismatch. Firmware: ffff CPUID: 2
[ 0.002000] [Firmware Bug]: CPU1: Using firmware package id 4095 instead of 0
[ 0.002000] smpboot: APIC(ffff) Converting physical 4095 to logical package 0
[ 0.078061] cpu 1 spinlock event irq 13
...
[ 0.216404] Freeing initrd memory: 4340K (ffff880001fa7000 - ffff8800023e4000)
[ 0.216487] RAPL PMU: rapl pmu error: max package: 1 but CPU0 belongs to 65535
[ 0.217572] futex hash table entries: 512 (order: 3, 32768 bytes)
[ 0.218293] Initialise system trusted keyrings
...
[ 0.216404] Freeing initrd memory: 4340K (ffff880001fa7000 - ffff8800023e4000)
[ 0.216487] RAPL PMU: rapl pmu error: max package: 1 but CPU0 belongs to 65535
[ 0.217572] futex hash table entries: 512 (order: 3, 32768 bytes)
...
[ 2.974474] intel_rapl: Found RAPL domain package
[ 2.974489] intel_rapl: Found RAPL domain core
[ 2.974498] intel_rapl: Found RAPL domain uncore
[ 2.974518] intel_rapl: RAPL package 4095 domain package locked by BIOS
=== >8 ===
As you can see, your patch unfortunately does not correct the issue with the virtual CPU package identifiers in Xen-based virtual machines using para-virtualization.
In summary, the root cause of the issue for me appears to be that the boot-up code in the init_apic_mappings function switches the APIC 'ops' structure from Xen's 'ops' structure to the no-op ops structure. Due to this, the smp_init_package_map uses the no-op APIC ops structure's cpu_present_to_to_apicid function, even though it should use the corresponding method from Xen's APIC ops structure (i.e., xen_cpu_present_to_apicid).
Here is a dmesg excerpt with a kernel patched with Sebastian's patch (and some debugging code), exhibiting the issue I have just explained: (Note the 'switched to apic NOOP' line.)
=== 8< ===
(early) [ 0.000000] SFI: Simple Firmware Interface v0.81 http://simplefirmware.org
(early) [ 0.000000] smpboot: Allowing 2 CPUs, 0 hotplug CPUs
(early) [ 0.000000] No local APIC present
(early) [ 0.000000] APIC: disable apic facility
(early) [ 0.000000] APIC: switched to apic NOOP
(early) [ 0.000000] e820: [mem 0xfa000000-0xffffffff] available for PCI devices
(early) [ 0.000000] Booting paravirtualized kernel on Xen
...
[ 0.034082] mvb: kernel_init_freeable:1007: About to call smp_prepare_cpus...
[ 0.034123] cpu 0 spinlock event irq 1
[ 0.034138] smpboot: mvb: smp_init_package_map:372: max_physical_pkg_id, after set: 4096
[ 0.034146] mvb: __default_cpu_present_to_apicid:612: Returning 65535! mps_cpu: 0, nr_cpu_ids: 2, cpu_present(mps_cpu): 1
[ 0.034155] smpboot: mvb: smp_init_package_map:379: apicid: 65535, apic_id_valid(apicid):0
[ 0.034162] smpboot: Max logical packages: 1
[ 0.034169] VPMU disabled by hypervisor.
[ 0.034187] Performance Events: unsupported p6 CPU model 42 no PMU driver, software events only.
[ 0.041131] NMI watchdog: disabled (cpu0): hardware events not enabled
[ 0.041142] NMI watchdog: Shutting down hard lockup detector on all cpus
[ 0.046024] installing Xen timer for CPU 1
[ 0.046121] SMP alternatives: switching to SMP code
[ 0.002000] Disabled fast string operations
[ 0.002000] mvb: detect_ht: CPU has hyper-threading capability
[ 0.002000] mvb: CPU: Physical Processor ID: 0
[ 0.002000] mvb: CPU: Processor Core ID: 0
[ 0.002000] mvb: identify_cpu:1112: c: ffff880013b0a040, c->logical_proc_id: 65535
[ 0.002000] mvb: __default_cpu_present_to_apicid:612: Returning 65535! mps_cpu: 1, nr_cpu_ids: 2, cpu_present(mps_cpu): 1
[ 0.002000] smpboot: mvb: topology_update_package_map:270: cpu: 1, pkg: 4095
[ 0.002000] smpboot: APIC(ffff) Converting physical 4095 to logical package 0
[ 0.002000] smpboot: mvb: topology_update_package_map:305: cpu: 1, cpu_data(cpu).logical_proc_id: 0
...
[ 0.266540] RAPL PMU: mvb: init_rapl_pmus:686: rapl pmu: max package: 1
[ 0.266547] RAPL PMU: mvb: rapl pmu: CPU0 (ffff880013a0a040) belongs to package ID 65535.
[ 0.266559] RAPL PMU: mvb: rapl pmu: Package ID >= max package for CPU0 (ffff880013a0a040). This is an error.
[ 0.266569] RAPL PMU: mvb: rapl pmu: CPU1 (ffff880013b0a040) belongs to package ID 0.
=== >8 ===
Through some debugging, last week I came up with the patch at the end of this e-mail, which does correct the issue for me. Once again, I am sorry for reporting this too late.
With a kernel built with the patch below, Sebastian's patch and some debugging code, I obtain the following dmesg output, which appears to be correct and does not emit RAPL-related errors during boot-up:
=== 8< ===
[ 0.032083] mvb: kernel_init_freeable:1007: About to call smp_prepare_cpus...
[ 0.032106] cpu 0 spinlock event irq 1
[ 0.032119] smpboot: mvb: smp_init_package_map:372: max_physical_pkg_id, after set: 4096
[ 0.032127] mvb: xen_cpu_present_to_apicid:151: cpu: 0, ret: 0
[ 0.032132] smpboot: mvb: topology_update_package_map:270: cpu: 0, pkg: 0
[ 0.032137] smpboot: APIC(0) Converting physical 0 to logical package 0
[ 0.032142] smpboot: mvb: topology_update_package_map:305: cpu: 0, cpu_data(cpu).logical_proc_id: 0
[ 0.032149] smpboot: mvb: smp_init_package_map:385: topology_update_package_map successful.
[ 0.032155] smpboot: Max logical packages: 1
[ 0.032161] VPMU disabled by hypervisor.
[ 0.032177] Performance Events: unsupported p6 CPU model 42 no PMU driver, software events only.
[ 0.039061] NMI watchdog: disabled (cpu0): hardware events not enabled
[ 0.039102] NMI watchdog: Shutting down hard lockup detector on all cpus
[ 0.045048] installing Xen timer for CPU 1
[ 0.045145] SMP alternatives: switching to SMP code
[ 0.002000] Disabled fast string operations
[ 0.002000] mvb: detect_ht: CPU has hyper-threading capability
[ 0.002000] mvb: CPU: Physical Processor ID: 0
[ 0.002000] mvb: CPU: Processor Core ID: 1
[ 0.002000] mvb: identify_cpu:1112: c: ffff88001790a040, c->logical_proc_id: 0
[ 0.002000] mvb: xen_cpu_present_to_apicid:151: cpu: 1, ret: 0
[ 0.002000] smpboot: mvb: topology_update_package_map:270: cpu: 1, pkg: 0
[ 0.002000] smpboot: mvb: topology_update_package_map:305: cpu: 1, cpu_data(cpu).logical_proc_id: 0
[ 0.067149] cpu 1 spinlock event irq 13
...
[ 0.208395] RAPL PMU: mvb: init_rapl_pmus:686: rapl pmu: max package: 1
[ 0.208400] RAPL PMU: mvb: rapl pmu: CPU0 (ffff88001780a040) belongs to package ID 0.
[ 0.208409] RAPL PMU: mvb: rapl pmu: CPU1 (ffff88001790a040) belongs to package ID 0.
[ 0.208491] RAPL PMU: API unit is 2^-32 Joules, 3 fixed counters, 163840 ms ovfl timer
[ 0.208507] RAPL PMU: hw unit of domain pp0-core 2^-16 Joules
[ 0.208512] RAPL PMU: hw unit of domain package 2^-16 Joules
[ 0.208517] RAPL PMU: hw unit of domain pp1-gpu 2^-16 Joules
[ 0.209083] futex hash table entries: 512 (order: 3, 32768 bytes)
=== >8 ===
My patch may not be a very elegant correction for the issue at hand, unfortunately (and it probably does not comply with the kernel patch submission guidelines).
In addition, I think Sebastian's patch should be included in the mainline kernel, so that potential bugs do not bring down the kernel with a RAPL-related oops at boot-up time with virtualization.
Please note that I will need to be offline for most of the day today (my current timezone is UTC+03:00), and as a result my responses to your replies will most likely be a bit late.
Thank you,
Vefa
>From b7097820285ab6a8588879969d74c56d890d4fd4 Mon Sep 17 00:00:00 2001
From: "M. Vefa Bicakci" <m.v.b@...box.com>
Date: Fri, 4 Nov 2016 10:01:19 +0300
Subject: [PATCH] Do not reset apic to apic_noop with Xen
---
arch/x86/include/asm/apic.h | 3 +++
arch/x86/kernel/apic/apic.c | 7 +++++--
arch/x86/xen/apic.c | 4 +++-
3 files changed, 11 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index b3d4c042e610..8c37580b7eb7 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -359,6 +359,9 @@ struct apic {
*/
extern struct apic *apic;
+/* Indicates whether a hypervisor has already set 'apic'. */
+extern int apic_set_by_hypervisor;
+
/*
* APIC drivers are probed based on how they are listed in the .apicdrivers
* section. So the order is important and enforced by the ordering
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 076c315cdf18..a3a1d4570acf 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -172,6 +172,8 @@ unsigned long mp_lapic_addr;
int disable_apic;
/* Disable local APIC timer from the kernel commandline or via dmi quirk */
static int disable_apic_timer __initdata;
+/* Indicates whether a hypervisor has already set 'apic'. */
+int apic_set_by_hypervisor;
/* Local APIC timer works in C2 */
int local_apic_timer_c2_ok;
EXPORT_SYMBOL_GPL(local_apic_timer_c2_ok);
@@ -1788,8 +1790,9 @@ void __init init_apic_mappings(void)
return;
}
- /* If no local APIC can be found return early */
- if (!smp_found_config && detect_init_APIC()) {
+ if (apic_set_by_hypervisor) {
+ /* Hypervisor has already taken care of the APIC. */
+ } else if (!smp_found_config && detect_init_APIC()) {
/* lets NOP'ify apic operations */
pr_info("APIC: disable apic facility\n");
apic_disable();
diff --git a/arch/x86/xen/apic.c b/arch/x86/xen/apic.c
index 5ac792a1d419..b79a003c13a5 100644
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -229,8 +229,10 @@ void __init xen_init_apic(void)
x86_io_apic_ops.read = xen_io_apic_read;
/* On PV guests the APIC CPUID bit is disabled so none of the
* routines end up executing. */
- if (!xen_initial_domain())
+ if (!xen_initial_domain()) {
apic = &xen_pv_apic;
+ apic_set_by_hypervisor = 1;
+ }
x86_platform.apic_post_init = xen_apic_check;
}
--
2.5.5
Powered by blists - more mailing lists