Message-ID: <22a9d92f51e0c6f8d4a3928b91f7f75e0297b93a.camel@infradead.org>
Date:   Thu, 21 Jan 2021 15:42:19 +0000
From:   David Woodhouse <dwmw2@...radead.org>
To:     Thomas Gleixner <tglx@...utronix.de>,
        Andy Lutomirski <luto@...nel.org>,
        "shenkai (D)" <shenkai8@...wei.com>,
        "Schander, Johanna 'Mimoja' Amelie" <mimoja@...zon.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        X86 ML <x86@...nel.org>, "H. Peter Anvin" <hpa@...or.com>,
        hewenliang4@...wei.com, hushiyuan@...wei.com,
        luolongjun@...wei.com, hejingxian@...wei.com
Subject: Re: [PATCH] use x86 cpu park to speedup smp_init in kexec situation

On Thu, 2021-01-21 at 15:55 +0100, Thomas Gleixner wrote:
> > Testing on real hardware has been more interesting and less useful so
> > far. We started with the CPUHP_BRINGUP_KICK_CPU state being
> > *immediately* before CPUHP_BRINGUP_CPU. On my 28-thread Haswell box,
> > that didn't come up at all even without actually *doing* anything in
> > the pre-bringup phase. Merely bringing all the AP threads up through
> > the various CPUHP_PREPARE_foo stages before actually bringing them
> > online, was enough to break it. I have no serial port on this box so we
> > haven't yet worked out why; I've resorted to putting the
> > CPUHP_BRINGUP_KICK_CPU state before CPUHP_WORKQUEUE_PREP instead.
> 
> Hrm.

Aha, I managed to reproduce it in qemu. It's CPUHP_X2APIC_PREPARE, which
is only used in x2apic *cluster* mode, not physical mode. So I actually
need to give the guest an IOMMU with IRQ remapping before I see it.


$ git diff
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index bc56287a1ed1..f503e66b4718 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -92,6 +92,7 @@ enum cpuhp_state {
        CPUHP_MIPS_SOC_PREPARE,
        CPUHP_BP_PREPARE_DYN,
        CPUHP_BP_PREPARE_DYN_END                = CPUHP_BP_PREPARE_DYN + 20,
+       CPUHP_BRINGUP_WAKE_CPU,
        CPUHP_BRINGUP_CPU,
        CPUHP_AP_IDLE_DEAD,
        CPUHP_AP_OFFLINE,
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 2b8d7a5db383..6c6f2986bfdb 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1336,6 +1336,12 @@ void bringup_nonboot_cpus(unsigned int setup_max_cpus)
 {
        unsigned int cpu;
 
+       for_each_present_cpu(cpu) {
+               if (num_online_cpus() >= setup_max_cpus)
+                       break;
+               if (!cpu_online(cpu))
+                       cpu_up(cpu, CPUHP_BRINGUP_WAKE_CPU);
+       }
        for_each_present_cpu(cpu) {
                if (num_online_cpus() >= setup_max_cpus)
                        break;
$ qemu-system-x86_64 -kernel arch/x86/boot/bzImage -append "console=ttyS0  trace_event=cpuhp tp_printk" -display none -serial mon:stdio  -m 2G -M q35,accel=kvm,kernel-irqchip=split -device intel-iommu,intremap=on -smp 40
...
[    0.349968] smp: Bringing up secondary CPUs ...
[    0.350281] cpuhp_enter: cpu: 0001 target:  42 step:   1 (smpboot_create_threads)
[    0.351421] cpuhp_exit:  cpu: 0001  state:   1 step:   1 ret: 0
[    0.352074] cpuhp_enter: cpu: 0001 target:  42 step:   2 (perf_event_init_cpu)
[    0.352276] cpuhp_exit:  cpu: 0001  state:   2 step:   2 ret: 0
[    0.353273] cpuhp_enter: cpu: 0001 target:  42 step:  37 (workqueue_prepare_cpu)
[    0.354377] cpuhp_exit:  cpu: 0001  state:  37 step:  37 ret: 0
[    0.355273] cpuhp_enter: cpu: 0001 target:  42 step:  39 (hrtimers_prepare_cpu)
[    0.356271] cpuhp_exit:  cpu: 0001  state:  39 step:  39 ret: 0
[    0.356937] cpuhp_enter: cpu: 0001 target:  42 step:  41 (x2apic_prepare_cpu)
[    0.357277] cpuhp_exit:  cpu: 0001  state:  41 step:  41 ret: 0
[    0.358278] cpuhp_enter: cpu: 0002 target:  42 step:   1 (smpboot_create_threads)
...
[    0.614278] cpuhp_enter: cpu: 0032 target:  42 step:   1 (smpboot_create_threads)
[    0.615610] cpuhp_exit:  cpu: 0032  state:   1 step:   1 ret: 0
[    0.616274] cpuhp_enter: cpu: 0032 target:  42 step:   2 (perf_event_init_cpu)
[    0.617271] cpuhp_exit:  cpu: 0032  state:   2 step:   2 ret: 0
[    0.618272] cpuhp_enter: cpu: 0032 target:  42 step:  37 (workqueue_prepare_cpu)
[    0.619388] cpuhp_exit:  cpu: 0032  state:  37 step:  37 ret: 0
[    0.620273] cpuhp_enter: cpu: 0032 target:  42 step:  39 (hrtimers_prepare_cpu)
[    0.621270] cpuhp_exit:  cpu: 0032  state:  39 step:  39 ret: 0
[    0.622009] cpuhp_enter: cpu: 0032 target:  42 step:  41 (x2apic_prepare_cpu)
[    0.622275] cpuhp_exit:  cpu: 0032  state:  41 step:  41 ret: 0
...
[    0.684272] cpuhp_enter: cpu: 0039 target:  42 step:  41 (x2apic_prepare_cpu)
[    0.685277] cpuhp_exit:  cpu: 0039  state:  41 step:  41 ret: 0
[    0.685979] cpuhp_enter: cpu: 0001 target: 217 step:  43 (smpcfd_prepare_cpu)
[    0.686283] cpuhp_exit:  cpu: 0001  state:  43 step:  43 ret: 0
[    0.687274] cpuhp_enter: cpu: 0001 target: 217 step:  44 (relay_prepare_cpu)
[    0.688274] cpuhp_exit:  cpu: 0001  state:  44 step:  44 ret: 0
[    0.689274] cpuhp_enter: cpu: 0001 target: 217 step:  47 (rcutree_prepare_cpu)
[    0.690271] cpuhp_exit:  cpu: 0001  state:  47 step:  47 ret: 0
[    0.690982] cpuhp_multi_enter: cpu: 0001 target: 217 step:  59 (trace_rb_cpu_prepare)
[    0.691281] cpuhp_exit:  cpu: 0001  state:  59 step:  59 ret: 0
[    0.692272] cpuhp_multi_enter: cpu: 0001 target: 217 step:  59 (trace_rb_cpu_prepare)
[    0.694640] cpuhp_exit:  cpu: 0001  state:  59 step:  59 ret: 0
[    0.695272] cpuhp_multi_enter: cpu: 0001 target: 217 step:  59 (trace_rb_cpu_prepare)
[    0.696280] cpuhp_exit:  cpu: 0001  state:  59 step:  59 ret: 0
[    0.697279] cpuhp_enter: cpu: 0001 target: 217 step:  65 (timers_prepare_cpu)
[    0.698168] cpuhp_exit:  cpu: 0001  state:  65 step:  65 ret: 0
[    0.698272] cpuhp_enter: cpu: 0001 target: 217 step:  67 (kvmclock_setup_percpu)
[    0.699270] cpuhp_exit:  cpu: 0001  state:  67 step:  67 ret: 0
[    0.700272] cpuhp_enter: cpu: 0001 target: 217 step:  88 (bringup_cpu)
[    0.701312] x86: Booting SMP configuration:
[    0.702270] .... node  #0, CPUs:        #1
[    0.127218] kvm-clock: cpu 1, msr 59401041, secondary cpu clock
[    0.127218] smpboot: CPU 1 Converting physical 0 to logical die 1
[    0.709281] cpuhp_enter: cpu: 0001 target: 217 step: 147 (smpboot_unpark_threads)
[    0.712294] cpuhp_exit:  cpu: 0001  state: 147 step: 147 ret: 0
[    0.714283] cpuhp_enter: cpu: 0001 target: 217 step: 149 (irq_affinity_online_cpu)
[    0.717292] cpuhp_exit:  cpu: 0001  state: 149 step: 149 ret: 0
[    0.719283] cpuhp_enter: cpu: 0001 target: 217 step: 153 (perf_event_init_cpu)
[    0.721279] cpuhp_exit:  cpu: 0001  state: 153 step: 153 ret: 0
[    0.724285] cpuhp_enter: cpu: 0001 target: 217 step: 179 (lockup_detector_online_cpu)
[    0.727279] cpuhp_exit:  cpu: 0001  state: 179 step: 179 ret: 0
[    0.729279] cpuhp_enter: cpu: 0001 target: 217 step: 180 (workqueue_online_cpu)
[    0.731309] cpuhp_exit:  cpu: 0001  state: 180 step: 180 ret: 0
[    0.733281] cpuhp_enter: cpu: 0001 target: 217 step: 181 (rcutree_online_cpu)
[    0.735276] cpuhp_exit:  cpu: 0001  state: 181 step: 181 ret: 0
[    0.737278] cpuhp_enter: cpu: 0001 target: 217 step: 183 (kvm_cpu_online)
[    0.739286] kvm-guest: stealtime: cpu 1, msr 7d46c080
[    0.740274] cpuhp_exit:  cpu: 0001  state: 183 step: 183 ret: 0
[    0.742278] cpuhp_enter: cpu: 0001 target: 217 step: 184 (page_writeback_cpu_online)
[    0.744275] cpuhp_exit:  cpu: 0001  state: 184 step: 184 ret: 0
[    0.745277] cpuhp_enter: cpu: 0001 target: 217 step: 185 (vmstat_cpu_online)
[    0.747276] cpuhp_exit:  cpu: 0001  state: 185 step: 185 ret: 0
[    0.749280] cpuhp_enter: cpu: 0001 target: 217 step: 216 (sched_cpu_activate)
[    0.750275] cpuhp_exit:  cpu: 0001  state: 216 step: 216 ret: 0
[    0.752273] cpuhp_exit:  cpu: 0001  state: 217 step:  88 ret: 0
[    0.753030] cpuhp_enter: cpu: 0002 target: 217 step:  43 (smpcfd_prepare_cpu)
...
[    2.311273] cpuhp_exit:  cpu: 0031  state: 217 step:  88 ret: 0
[    2.312278] cpuhp_enter: cpu: 0032 target: 217 step:  43 (smpcfd_prepare_cpu)
[    2.313119] cpuhp_exit:  cpu: 0032  state:  43 step:  43 ret: 0
[    2.313277] cpuhp_enter: cpu: 0032 target: 217 step:  44 (relay_prepare_cpu)
[    2.314275] cpuhp_exit:  cpu: 0032  state:  44 step:  44 ret: 0
[    2.315274] cpuhp_enter: cpu: 0032 target: 217 step:  47 (rcutree_prepare_cpu)
[    2.316104] cpuhp_exit:  cpu: 0032  state:  47 step:  47 ret: 0
[    2.316273] cpuhp_multi_enter: cpu: 0032 target: 217 step:  59 (trace_rb_cpu_prepare)
[    2.317292] cpuhp_exit:  cpu: 0032  state:  59 step:  59 ret: 0
[    2.318275] cpuhp_multi_enter: cpu: 0032 target: 217 step:  59 (trace_rb_cpu_prepare)
[    2.320401] cpuhp_exit:  cpu: 0032  state:  59 step:  59 ret: 0
[    2.321111] cpuhp_multi_enter: cpu: 0032 target: 217 step:  59 (trace_rb_cpu_prepare)
[    2.321286] cpuhp_exit:  cpu: 0032  state:  59 step:  59 ret: 0
[    2.322273] cpuhp_enter: cpu: 0032 target: 217 step:  65 (timers_prepare_cpu)
[    2.323271] cpuhp_exit:  cpu: 0032  state:  65 step:  65 ret: 0
[    2.324272] cpuhp_enter: cpu: 0032 target: 217 step:  67 (kvmclock_setup_percpu)
[    2.325133] cpuhp_exit:  cpu: 0032  state:  67 step:  67 ret: 0
[    2.325273] cpuhp_enter: cpu: 0032 target: 217 step:  88 (bringup_cpu)
[    2.326292]  #32
[    2.289283] kvm-clock: cpu 32, msr 59401801, secondary cpu clock
[    2.289283] BUG: kernel NULL pointer dereference, address: 0000000000000000
[    2.289283] #PF: supervisor write access in kernel mode
[    2.289283] #PF: error_code(0x0002) - not-present page
[    2.289283] PGD 0 P4D 0 
[    2.289283] Oops: 0002 [#1] SMP PTI
[    2.289283] CPU: 32 PID: 0 Comm: swapper/32 Not tainted 5.10.0+ #745
[    2.289283] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1.fc33 04/01/2014
[    2.289283] RIP: 0010:init_x2apic_ldr+0xa0/0xb0
[    2.289283] Code: 89 2d 9c 81 fb 72 65 8b 15 cd 12 fb 72 89 d2 f0 48 0f ab 50 08 5b 5d c3 48 8b 05 a3 7b 09 02 48 c7 05 98 7b 09 02 00 00 00 00 <89> 18 eb cd 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 89
[    2.289283] RSP: 0000:ffffb15e8016fec0 EFLAGS: 00010046
[    2.289283] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000040
[    2.289283] RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 0000000000000028
[    2.289283] RBP: 0000000000018428 R08: 0000000000000000 R09: 0000000000000028
[    2.289283] R10: ffffb15e8016fd78 R11: ffff88ca7ff28368 R12: 0000000000000200
[    2.289283] R13: 0000000000000020 R14: 0000000000000000 R15: 0000000000000000
[    2.289283] FS:  0000000000000000(0000) GS:ffff88ca7dc00000(0000) knlGS:0000000000000000
[    2.289283] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.289283] CR2: 0000000000000000 CR3: 0000000058610000 CR4: 00000000000006a0
[    2.289283] Call Trace:
[    2.289283]  setup_local_APIC+0x88/0x320
[    2.289283]  ? printk+0x48/0x4a
[    2.289283]  apic_ap_setup+0xa/0x20
[    2.289283]  start_secondary+0x2f/0x130
[    2.289283]  secondary_startup_64_no_verify+0xc2/0xcb
[    2.289283] Modules linked in:
[    2.289283] CR2: 0000000000000000
[    2.289283] ---[ end trace 676dcdbf63e55075 ]---
[    2.289283] RIP: 0010:init_x2apic_ldr+0xa0/0xb0
[    2.289283] Code: 89 2d 9c 81 fb 72 65 8b 15 cd 12 fb 72 89 d2 f0 48 0f ab 50 08 5b 5d c3 48 8b 05 a3 7b 09 02 48 c7 05 98 7b 09 02 00 00 00 00 <89> 18 eb cd 66 66 2e 0f 1f 84 00 00 00 00 00 90 0f 1f 44 00 00 89
[    2.289283] RSP: 0000:ffffb15e8016fec0 EFLAGS: 00010046
[    2.289283] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000040
[    2.289283] RDX: 00000000ffffffff RSI: 0000000000000000 RDI: 0000000000000028
[    2.289283] RBP: 0000000000018428 R08: 0000000000000000 R09: 0000000000000028
[    2.289283] R10: ffffb15e8016fd78 R11: ffff88ca7ff28368 R12: 0000000000000200
[    2.289283] R13: 0000000000000020 R14: 0000000000000000 R15: 0000000000000000
[    2.289283] FS:  0000000000000000(0000) GS:ffff88ca7dc00000(0000) knlGS:0000000000000000
[    2.289283] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    2.289283] CR2: 0000000000000000 CR3: 0000000058610000 CR4: 00000000000006a0
[    2.289283] Kernel panic - not syncing: Attempted to kill the idle task!
[    2.289283] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
