Message-ID: <3bfacf45d2d0f3dfa3789ff5a2dcb46744aacff7.camel@infradead.org>
Date: Tue, 28 Dec 2021 14:18:57 +0000
From: David Woodhouse <dwmw2@...radead.org>
To: Paul Menzel <pmenzel@...gen.mpg.de>,
Thomas Gleixner <tglx@...utronix.de>
Cc: Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
"H . Peter Anvin" <hpa@...or.com>,
Paolo Bonzini <pbonzini@...hat.com>,
"Paul E . McKenney" <paulmck@...nel.org>,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
rcu@...r.kernel.org, mimoja@...oja.de, hewenliang4@...wei.com,
hushiyuan@...wei.com, luolongjun@...wei.com, hejingxian@...wei.com
Subject: Re: [PATCH v3 0/9] Parallel CPU bringup for x86_64
On Tue, 2021-12-28 at 12:34 +0100, Paul Menzel wrote:
> Same on the ASUS F2A85-M PRO with AMD A6-6400K. Without serial console,
> the messages below are printed to the monitor after nine seconds.
>
> [ 1.078879] smp: Bringing up secondary CPUs ...
> [ 1.080950] x86: Booting SMP configuration:
>
> Please find the serial log attached.
>
Thanks for testing. That looks like the same triple-fault on bringup
that we have been seeing, and that I reproduced without my patches
using kexec all the way back to a 5.0 kernel.

Out of interest, are you also able to reproduce it with kexec and
without the parallel bringup?

And with that patch I sent Tom in
https://lore.kernel.org/lkml/721484e0fa719e99f9b8f13e67de05033dd7cc86.camel@infradead.org/
to expand the bitlock exclusion and stop the bringup being truly in
parallel at all?

Or the one in
https://lore.kernel.org/lkml/d4cde50b4aab24612823714dfcbe69bc4bb63b60.camel@infradead.org
which makes it do nothing except prepare all the CPUs before bringing
them up one at a time?

My current theory (not that I've spent that much time thinking about it
in the last week) is that there's something about the existing CPU
bringup, possibly a CPU bug or something special about the AMD CPUs,
which is triggered by just making it a little bit *faster*, which is
why bringing them up from kexec (especially in qemu) can cause it too?

Tom seemed to find that it was in load_TR_desc(), so if you could try
this hack on a machine that doesn't magically wink out of existence on
a triple-fault before even flushing its serial output, that would be
much appreciated...
diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index ab97b22ac04a..cc6590712ff4 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -8,7 +8,7 @@
#include <asm/fixmap.h>
#include <asm/irq_vectors.h>
#include <asm/cpu_entry_area.h>
-
+#include <asm/io.h>
#include <linux/debug_locks.h>
#include <linux/smp.h>
#include <linux/percpu.h>
@@ -265,11 +265,16 @@ static inline void native_load_tr_desc(void)
* If the current GDT is the read-only fixmap, swap to the original
* writeable version. Swap back at the end.
*/
+ outb('d', 0x3f8);
if (gdt.address == (unsigned long)fixmap_gdt) {
+ outb('e', 0x3f8);
load_direct_gdt(cpu);
restore = 1;
+ outb('f', 0x3f8);
}
+ outb('g', 0x3f8);
asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));
+ outb('h', 0x3f8);
if (restore)
load_fixmap_gdt(cpu);
}
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0083464de5e3..5bc8f30c3283 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1716,7 +1716,9 @@ void identify_secondary_cpu(struct cpuinfo_x86 *c)
enable_sep_cpu();
#endif
mtrr_ap_init();
+outb('A', 0x3f8);
validate_apic_and_package_id(c);
+outb('B', 0x3f8);
x86_spec_ctrl_setup_ap();
update_srbds_msr();
}
@@ -1957,6 +1959,7 @@ static inline void tss_setup_io_bitmap(struct tss_struct *tss)
tss->io_bitmap.mapall[IO_BITMAP_LONGS] = ~0UL;
#endif
}
+#include <asm/realmode.h>
/*
* Setup everything needed to handle exceptions from the IDT, including the IST
@@ -1969,16 +1972,24 @@ void cpu_init_exception_handling(void)
/* paranoid_entry() gets the CPU number from the GDT */
setup_getcpu(cpu);
-
+ outb('\n', 0x3f8);
+ outb('0' + cpu / 100, 0x3f8);
+ outb('0' + (cpu % 100) / 10, 0x3f8);
+ outb('0' + (cpu % 10), 0x3f8);
+
/* IST vectors need TSS to be set up. */
tss_setup_ist(tss);
+ outb('a', 0x3f8);
tss_setup_io_bitmap(tss);
set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss);
-
+ outb('b', 0x3f8);
load_TR_desc();
+ outb('c', 0x3f8);
/* Finally load the IDT */
load_current_idt();
+ outb('z', 0x3f8);
+
}
/*
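
For reading the resulting serial output, the marker characters in the
hack above can be mapped back to the point each one was emitted. The
sketch below is not part of the patch: last_marker() and
describe_marker() are hypothetical userspace helpers, and they assume
the trail was captured from a single CPU's line (a newline, three CPU
digits, then markers). With truly parallel bringup the output of
several CPUs may interleave, in which case the last character alone
says less.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the last marker character on the final line of a captured
 * trail, or '\0' if that line holds nothing beyond the 3-digit CPU
 * number. Hypothetical helper, not kernel code.
 */
char last_marker(const char *trail)
{
	assert(trail != NULL);

	const char *line = strrchr(trail, '\n');
	line = line ? line + 1 : trail;

	size_t len = strlen(line);
	return len > 3 ? line[len - 1] : '\0';
}

/* Map a marker from the debug hack to the point where it was emitted. */
const char *describe_marker(char m)
{
	switch (m) {
	case 'a': return "tss_setup_ist() returned";
	case 'b': return "set_tss_desc() returned, about to call load_TR_desc()";
	case 'd': return "in native_load_tr_desc(), before the fixmap GDT check";
	case 'e': return "GDT is the fixmap, about to call load_direct_gdt()";
	case 'f': return "load_direct_gdt() returned";
	case 'g': return "about to execute ltr";
	case 'h': return "ltr completed";
	case 'c': return "load_TR_desc() returned";
	case 'z': return "load_current_idt() returned";
	case 'A': return "mtrr_ap_init() returned";
	case 'B': return "validate_apic_and_package_id() returned";
	default:  return "unknown marker";
	}
}
```

For example, a trail whose final line is "001abdg" would point at the
ltr instruction itself, while one ending in "001abdefgh" would mean ltr
completed and the fault came later.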