Message-ID: <c795e967-b69d-33fd-b333-b0f8e0636b5b@bytedance.com>
Date:   Mon, 27 Feb 2023 06:25:17 +0000
From:   Usama Arif <usama.arif@...edance.com>
To:     David Woodhouse <dwmw2@...radead.org>,
        Oleksandr Natalenko <oleksandr@...alenko.name>,
        tglx@...utronix.de, kim.phillips@....com, brgerst@...il.com
Cc:     piotrgorski@...hyos.org, arjan@...ux.intel.com, mingo@...hat.com,
        bp@...en8.de, dave.hansen@...ux.intel.com, hpa@...or.com,
        x86@...nel.org, pbonzini@...hat.com, paulmck@...nel.org,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
        rcu@...r.kernel.org, mimoja@...oja.de, hewenliang4@...wei.com,
        thomas.lendacky@....com, seanjc@...gle.com, pmenzel@...gen.mpg.de,
        fam.zheng@...edance.com, punit.agrawal@...edance.com,
        simon.evans@...edance.com, liangma@...ngbit.com
Subject: Re: [External] Re: [PATCH v12 00/11] Parallel CPU bringup for x86_64



On 27/02/2023 06:13, David Woodhouse wrote:
> 
> 
> On 26 February 2023 20:59:17 GMT, Usama Arif <usama.arif@...edance.com> wrote:
>>
>>
>> On 26/02/2023 18:31, Oleksandr Natalenko wrote:
>>> Hello.
>>>
>>> On Sunday, 26 February 2023 12:07:51 CET Usama Arif wrote:
>>>> The main code change over v11 is the build error fix by Brian Gerst and
>>>> acquiring tr_lock in trampoline_64.S whenever the stack is set up.
>>>>
>>>> The git history is also rewritten to move the commits that removed
>>>> initial_stack, early_gdt_descr and initial_gs earlier in the patchset.
>>>>
>>>> Thanks,
>>>> Usama
>>>>
>>>> Changes across versions:
>>>> v2: Cut it back to just INIT/SIPI/SIPI in parallel for now, nothing more
>>>> v3: Clean up x2apic patch, add MTRR optimisation, lock topology update
>>>>       in preparation for more parallelisation.
>>>> v4: Fixes to the real mode parallelisation patch spotted by SeanC, to
>>>>       avoid scribbling on initial_gs in common_cpu_up(), and to allow all
>>>>       24 bits of the physical X2APIC ID to be used. That patch still needs
>>>>       a Signed-off-by from its original author, who once claimed not to
>>>>       remember writing it at all. But now we've fixed it, hopefully he'll
>>>>       admit it now :)
>>>> v5: rebase to v6.1 and remeasure performance, disable parallel bringup
>>>>       for AMD CPUs.
>>>> v6: rebase to v6.2-rc6, disabled parallel boot on AMD because of a CPU bug
>>>>       and reused timer calibration for secondary CPUs.
>>>> v7: [David Woodhouse] iterate over all possible CPUs to find any existing
>>>>       cluster mask in alloc_clustermask. (patch 1/9)
>>>>       Keep parallel AMD support enabled in AMD, using APIC ID in CPUID leaf
>>>>       0x0B (for x2APIC mode) or CPUID leaf 0x01 where 8 bits are sufficient.
>>>>       Included sanity checks for APIC id from 0x0B. (patch 6/9)
>>>>       Removed patch for reusing timer calibration for secondary CPUs.
>>>>       commit message and code improvements.
>>>> v8: Fix CPU0 hotplug by setting up the initial_gs, initial_stack and
>>>>       early_gdt_descr.
>>>>       Drop trampoline lock and bail if APIC ID not found in find_cpunr.
>>>>       Code comments improved and debug prints added.
>>>> v9: Drop patch to avoid repeated saves of MTRR at boot time.
>>>>       rebased and retested at v6.2-rc8.
>>>>       added kernel doc for no_parallel_bringup and made do_parallel_bringup
>>>>       __ro_after_init.
>>>> v10: Fixed suspend/resume not working with parallel smpboot.
>>>>        rebased and retested to 6.2.
>>>>        fixed checkpatch errors.
>>>> v11: Added patches from Brian Gerst to remove the global variables initial_gs,
>>>>        initial_stack, and early_gdt_descr from the 64-bit boot code
>>>>        (https://lore.kernel.org/all/20230222221301.245890-1-brgerst@gmail.com/).
>>>> v12: Fixed compilation errors, acquire tr_lock for every stack setup in
>>>>        trampoline_64.S.
>>>>        Rearranged commits for a cleaner git history.
>>>>
>>>> Brian Gerst (3):
>>>>     x86/smpboot: Remove initial_stack on 64-bit
>>>>     x86/smpboot: Remove early_gdt_descr on 64-bit
>>>>     x86/smpboot: Remove initial_gs
>>>>
>>>> David Woodhouse (8):
>>>>     x86/apic/x2apic: Allow CPU cluster_mask to be populated in parallel
>>>>     cpu/hotplug: Move idle_thread_get() to <linux/smpboot.h>
>>>>     cpu/hotplug: Add dynamic parallel bringup states before
>>>>       CPUHP_BRINGUP_CPU
>>>>     x86/smpboot: Reference count on smpboot_setup_warm_reset_vector()
>>>>     x86/smpboot: Split up native_cpu_up into separate phases and document
>>>>       them
>>>>     x86/smpboot: Support parallel startup of secondary CPUs
>>>>     x86/smpboot: Send INIT/SIPI/SIPI to secondary CPUs in parallel
>>>>     x86/smpboot: Serialize topology updates for secondary bringup
>>>>
>>>>    .../admin-guide/kernel-parameters.txt         |   3 +
>>>>    arch/x86/include/asm/processor.h              |   6 +-
>>>>    arch/x86/include/asm/realmode.h               |   4 +-
>>>>    arch/x86/include/asm/smp.h                    |  15 +-
>>>>    arch/x86/include/asm/topology.h               |   2 -
>>>>    arch/x86/kernel/acpi/sleep.c                  |  15 +-
>>>>    arch/x86/kernel/apic/apic.c                   |   2 +-
>>>>    arch/x86/kernel/apic/x2apic_cluster.c         | 126 ++++---
>>>>    arch/x86/kernel/asm-offsets.c                 |   1 +
>>>>    arch/x86/kernel/cpu/common.c                  |   6 +-
>>>>    arch/x86/kernel/head_64.S                     | 129 +++++--
>>>>    arch/x86/kernel/smpboot.c                     | 350 +++++++++++++-----
>>>>    arch/x86/realmode/init.c                      |   3 +
>>>>    arch/x86/realmode/rm/trampoline_64.S          |  27 +-
>>>>    arch/x86/xen/smp_pv.c                         |   4 +-
>>>>    arch/x86/xen/xen-head.S                       |   2 +-
>>>>    include/linux/cpuhotplug.h                    |   2 +
>>>>    include/linux/smpboot.h                       |   7 +
>>>>    kernel/cpu.c                                  |  31 +-
>>>>    kernel/smpboot.h                              |   2 -
>>>>    20 files changed, 537 insertions(+), 200 deletions(-)
>>>
>>> With `CONFIG_FORCE_NR_CPUS=y` this results in:
>>>
>>> ```
>>> ld: vmlinux.o: in function `secondary_startup_64_no_verify':
>>> (.head.text+0x10c): undefined reference to `nr_cpu_ids'
>>> ```
>>>
>>> That's because in `arch/x86/kernel/head_64.S` `secondary_startup_64_no_verify()` refers to `nr_cpu_ids` under `#ifdef CONFIG_SMP`, but this symbol is available under the following conditions:
>>>
>>> ```
>>> #if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
>>> #define nr_cpu_ids ((unsigned int)NR_CPUS)
>>> #else
>>> extern unsigned int nr_cpu_ids;
>>> #endif
>>>
>>> #if (NR_CPUS > 1) && !defined(CONFIG_FORCE_NR_CPUS)
>>> /* Setup number of possible processor ids */
>>> unsigned int nr_cpu_ids __read_mostly = NR_CPUS;
>>> EXPORT_SYMBOL(nr_cpu_ids);
>>> #endif
>>> ```
>>>
>>> So having `CONFIG_SMP=y` and, for instance, `CONFIG_NR_CPUS=320`, it is possible to compile out `EXPORT_SYMBOL(nr_cpu_ids);` if `CONFIG_FORCE_NR_CPUS=y` is set.
>>>
>>
>> I think something like below diff should work in all scenarios?
> 
> I'd've changed the asm side to use the constant limit.

Yup, just needed the morning coffee :) Had sent the proper fix in 
https://lore.kernel.org/all/5e8ad90a-1dc6-95c2-e020-5e95da6f9eda@bytedance.com/#t
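
For anyone following along, here is a rough, standalone C sketch of why the link fails. It is only an illustration and not the actual fix: FORCE_NR_CPUS is a hypothetical stand-in for CONFIG_FORCE_NR_CPUS, the value 320 is taken from the example above, and cpu_nr_valid() is a made-up helper. With the override set, nr_cpu_ids is only a preprocessor constant, so C code compiles fine, but assembly that names the symbol directly has nothing to link against and would need to use the constant limit instead.

```c
#define NR_CPUS 320        /* assumed value, from the example above */
#define FORCE_NR_CPUS 1    /* hypothetical stand-in for CONFIG_FORCE_NR_CPUS=y */

#if (NR_CPUS == 1) || FORCE_NR_CPUS
/* No object is emitted: nr_cpu_ids exists only as a compile-time constant. */
#define nr_cpu_ids ((unsigned int)NR_CPUS)
#else
/* A real variable exists, so the linker can resolve references to it. */
unsigned int nr_cpu_ids = NR_CPUS;
#endif

/* Hypothetical helper: check a CPU number against the limit.
 * This compiles in both configurations because the C preprocessor
 * substitutes the constant when no variable is defined. */
static inline int cpu_nr_valid(unsigned int cpu)
{
	return cpu < nr_cpu_ids;
}
```

Setting FORCE_NR_CPUS to 0 in the sketch switches back to the variable form, which is the only case where a symbol named nr_cpu_ids exists at link time.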

I guess the diff over v12 (including the cosmetic changes) is still too 
small to justify sending out a new version so soon; probably better to 
wait a couple of days in case something else comes up as well?

Thanks,
Usama
