Message-ID: <8e2f03e2-9517-aeb4-df60-b36ef3ff3a75@bytedance.com>
Date:   Thu, 9 Feb 2023 09:49:04 +0000
From:   Usama Arif <usama.arif@...edance.com>
To:     paulmck@...nel.org, dwmw2@...radead.org, tglx@...utronix.de
Cc:     kim.phillips@....com, arjan@...ux.intel.com, mingo@...hat.com,
        bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com,
        x86@...nel.org, pbonzini@...hat.com, linux-kernel@...r.kernel.org,
        kvm@...r.kernel.org, rcu@...r.kernel.org, mimoja@...oja.de,
        hewenliang4@...wei.com, thomas.lendacky@....com, seanjc@...gle.com,
        pmenzel@...gen.mpg.de, fam.zheng@...edance.com,
        punit.agrawal@...edance.com, simon.evans@...edance.com,
        liangma@...ngbit.com
Subject: Re: [External] Re: [PATCH v7 0/9] Parallel CPU bringup for x86_64



On 09/02/2023 03:53, Paul E. McKenney wrote:
> On Tue, Feb 07, 2023 at 11:04:27PM +0000, Usama Arif wrote:
>> Tested on v7, doing INIT/SIPI/SIPI in parallel brings down the time for
>> smpboot from ~700ms to 100ms (85% improvement) on a server with 128 CPUs
>> split across 2 NUMA nodes.
>>
>> The major change over v6 is keeping parallel smp support enabled in AMD.
>> APIC ID for parallel CPU bringup is now obtained from CPUID leaf 0x0B
>> (for x2APIC mode) otherwise CPUID leaf 0x1 (8 bits).
>>
>> The patch for reusing timer calibration for secondary CPUs is also removed
>> from the series as its not part of parallel smp bringup and needs to be
>> further thought about.
> 
> Running rcutorture on this got me the following NULL pointer dereference
> on scenario TREE01:
> 
> ------------------------------------------------------------------------
> 
> [   34.662066] smpboot: CPU 0 is now offline
> [   34.674075] rcu: NOCB: Cannot CB-offload offline CPU 25
> [   35.038003] rcu: De-offloading 5
> [   35.112997] rcu: Offloading 12
> [   35.716011] smpboot: Booting Node 0 Processor 0 APIC 0x0
> [   35.762685] BUG: kernel NULL pointer dereference, address: 0000000000000001
> [   35.764278] #PF: supervisor instruction fetch in kernel mode
> [   35.765530] #PF: error_code(0x0010) - not-present page
> [   35.766700] PGD 0 P4D 0
> [   35.767278] Oops: 0010 [#1] PREEMPT SMP PTI
> [   35.768223] CPU: 36 PID: 0 Comm: swapper/36 Not tainted 6.2.0-rc1-00206-g18a37610b632-dirty #3563
> [   35.770201] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> 
> ------------------------------------------------------------------------
> 
> Given an x86 system with KVM and qemu, this can be reproduced by running
> the following from the top-level directory in the Linux-kernel source
> tree:
> 
> 	tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --configs "TREE01 TINY01" --trust-make
> 
> Out of 15 runs, 14 blew up just after the first attempt to bring CPU
> 0 back online.  The 15th run blew up just after the second attempt to
> bring CPU 0 online, the first attempt having succeeded.
> 
> My guess is that the CONFIG_BOOTPARAM_HOTPLUG_CPU0=y Kconfig option is
> tickling this bug.  This Kconfig option has been added to the TREE01
> scenario in the -rcu tree's "dev" branch, which might mean that this test
> would pass on mainline.  But CONFIG_BOOTPARAM_HOTPLUG_CPU0=y is not new,
> only rcutorture's testing of it.
> 
> Thoughts?
> 
> 							Thanx, Paul

It looks like this happens because initial_gs, initial_stack and
early_gdt_descr are not set up properly for CPU0 hotplug, i.e.
init_cpu_data isn't called in the CPU0 hotplug case.

It's easy to test, just by doing:
echo 0 > /sys/devices/system/cpu/cpu0/online;
echo 1 > /sys/devices/system/cpu/cpu0/online;
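
Since one of the 15 rcutorture runs only blew up on the second attempt to
bring CPU 0 online, it may be worth repeating the cycle a few times. A
minimal sketch of that, not part of the patch: cycle_cpu and the
SYSFS_CPU_ROOT override are hypothetical names I made up for illustration,
and the override only exists so the loop can be tried against a scratch
directory instead of the real sysfs.

```shell
#!/bin/sh
# Hypothetical helper (not from the series): take a CPU offline and bring it
# back online COUNT times through its sysfs "online" control file.
# SYSFS_CPU_ROOT is an assumed override for dry runs; it defaults to the
# real sysfs location.
cycle_cpu() {
    cpu=$1
    count=$2
    root=${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}
    ctl="$root/cpu$cpu/online"

    # The control file is missing when the CPU is not hotpluggable.
    [ -w "$ctl" ] || { echo "cannot write $ctl" >&2; return 1; }

    i=1
    while [ "$i" -le "$count" ]; do
        echo 0 > "$ctl" || return 1
        echo 1 > "$ctl" || return 1
        # Read the state back to confirm the CPU reports itself online.
        [ "$(cat "$ctl")" = "1" ] || {
            echo "cpu$cpu not online after cycle $i" >&2
            return 1
        }
        i=$((i + 1))
    done
    echo "cpu$cpu survived $count offline/online cycles"
}
```

Running "cycle_cpu 0 15" as root would then repeat the two commands above
15 times. On an affected kernel I would expect the oops to show up in dmesg
rather than as a clean failure from the script.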

As a quick check, if we do something like the below (there is probably a
much better place to set these), the above hotplug commands work.

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 3ec5182d9698..184135c47ee5 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1080,6 +1080,7 @@ wakeup_cpu_via_init_nmi(int cpu, unsigned long start_ip, int apicid,
                                           wakeup_cpu0_nmi, 0, "wake_cpu0");

         if (!boot_error) {
+               initial_gs = per_cpu_offset(cpu);
                 enable_start_cpu0 = 1;
                 *cpu0_nmi_registered = 1;
                 id = apic->dest_mode_logical ? cpu0_logical_apicid : apicid;
@@ -1188,10 +1189,14 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle,
                 boot_error = apic->wakeup_secondary_cpu_64(apicid, start_ip);
         else if (apic->wakeup_secondary_cpu)
                 boot_error = apic->wakeup_secondary_cpu(apicid, start_ip);
-       else
+       else {
+               if (!cpu) {
+                       early_gdt_descr.address = (unsigned long)get_cpu_gdt_rw(cpu);
+                       initial_stack = idle->thread.sp;
+               }
                 boot_error = wakeup_cpu_via_init_nmi(cpu, start_ip, apicid,
                                                      cpu0_nmi_registered);
-
+       }
         return boot_error;
  }


