lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Thu, 6 Mar 2014 15:10:32 +0100 From: Igor Mammedov <imammedo@...hat.com> To: linux-kernel@...r.kernel.org Cc: tglx@...utronix.de, mingo@...hat.com, hpa@...or.com, x86@...nel.org, seiji.aguchi@....com, bp@...e.de, imammedo@...hat.com, paul.gortmaker@...driver.com, peterz@...radead.org, JBeulich@...e.com, kirill.shutemov@...ux.intel.com, prarit@...hat.com, toshi.kani@...com, drjones@...hat.com Subject: [PATCH v2] abort secondary CPU bring-up gracefully if do_boot_cpu timed out on cpu_callin_mask Hang is observed on virtual machines during CPU hotplug, especially in big guests with many CPUs. (It happens more often if host is over-committed). Patch is present in RHEL6 since 2012 and it fixes issue there, it also fixes issue in upstream kernel according to tests. Below is detailed description of the problem: ----- Master CPU may timeout before cpu_callin_mask is set and cancel booting CPU, but being onlined CPU still continues to boot, sets cpu_active_mask (CPU_STARTING notifiers) and spins in check_tsc_sync_target() for master cpu to arrive. Following attempt to online another cpu hangs in stop_machine, initiated from here: smp_callin -> smp_store_cpu_info -> identify_secondary_cpu -> mtrr_ap_init -> set_mtrr_from_inactive_cpu stop_machine waits on completion of stop_work on all CPUs from cpu_active_mask including a failed CPU that spins in check_tsc_sync_target(). Issue is fixed if being onlined CPU continues to boot and calls notify_cpu_starting(cpuid) only when master CPU waits for it to come online. If master CPU times out on cpu_callin_mask and goes via cancel path, the being onlined CPU should gracefully shutdown itself. Patch introduces cpu_may_complete_boot_mask to notify a being onlined CPU that it may call notify_cpu_starting(cpuid) and continue to boot when master CPU goes via normal boot path and going to wait till the being onlined CPU completes its initialization. - normal boot sequence will look like: master CPU1 being onlined CPU2 * wait for CPU2 in cpu_callin_mask --------------------------------------------------------------------- * set CPU2 in cpu_callin_mask * wait till CPU1 set CPU2 bit in cpu_may_complete_boot_mask --------------------------------------------------------------------- * set CPU2 bit in cpu_may_complete_boot_mask * return from do_boot_cpu() and wait in - check_tsc_sync_source() or - while (!cpu_online(CPU2)) --------------------------------------------------------------------- * call notify_cpu_starting() and continue CPU2 initialization * mark itself as ONLINE --------------------------------------------------------------------- * return to _cpu_up and call cpu_notify(CPU_ONLINE, ...) - cancel/error path will look like: master CPU1 being onlined CPU2 * time out on cpu_callin_mask --------------------------------------------------------------------- * set CPU2 in cpu_callin_mask * wait till CPU2 is set in cpu_may_complete_boot_mask or cleared in cpu_callout_mask --------------------------------------------------------------------- * clear CPU2 in cpu_callout_mask and return with error --------------------------------------------------------------------- * do cleanups and play_dead() Signed-off-by: Igor Mammedov <imammedo@...hat.com> --- v2: * updated commit message with description of which systems are affected but bug and under which conditions. --- arch/x86/include/asm/cpumask.h | 1 + arch/x86/kernel/cpu/common.c | 2 ++ arch/x86/kernel/smpboot.c | 37 +++++++++++++++++++++++++++++++++++-- 3 files changed, 38 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/cpumask.h b/arch/x86/include/asm/cpumask.h index 61c852f..eacd269 100644 --- a/arch/x86/include/asm/cpumask.h +++ b/arch/x86/include/asm/cpumask.h @@ -7,6 +7,7 @@ extern cpumask_var_t cpu_callin_mask; extern cpumask_var_t cpu_callout_mask; extern cpumask_var_t cpu_initialized_mask; extern cpumask_var_t cpu_sibling_setup_mask; +extern cpumask_var_t cpu_may_complete_boot_mask; extern void setup_cpu_local_masks(void); diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 8e28bf2..4797111 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -50,6 +50,7 @@ cpumask_var_t cpu_initialized_mask; cpumask_var_t cpu_callout_mask; cpumask_var_t cpu_callin_mask; +cpumask_var_t cpu_may_complete_boot_mask; /* representing cpus for which sibling maps can be computed */ cpumask_var_t cpu_sibling_setup_mask; @@ -61,6 +62,7 @@ void __init setup_cpu_local_masks(void) alloc_bootmem_cpumask_var(&cpu_callin_mask); alloc_bootmem_cpumask_var(&cpu_callout_mask); alloc_bootmem_cpumask_var(&cpu_sibling_setup_mask); + alloc_bootmem_cpumask_var(&cpu_may_complete_boot_mask); } static void default_init(struct cpuinfo_x86 *c) diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index a32da80..4e9a63b 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -104,6 +104,8 @@ EXPORT_PER_CPU_SYMBOL(cpu_info); atomic_t init_deasserted; +static void remove_siblinginfo(int cpu); + /* * Report back to the Boot Processor during boot time or to the caller processor * during CPU online. @@ -202,12 +204,38 @@ static void smp_callin(void) set_cpu_sibling_map(raw_smp_processor_id()); wmb(); - notify_cpu_starting(cpuid); - /* * Allow the master to continue. */ cpumask_set_cpu(cpuid, cpu_callin_mask); + + /* + * Wait for signal from master CPU to continue or abort. + */ + while (!cpumask_test_cpu(cpuid, cpu_may_complete_boot_mask) && + cpumask_test_cpu(cpuid, cpu_callout_mask)) { + cpu_relax(); + } + + /* die if master cancelled cpu_up */ + if (!cpumask_test_cpu(cpuid, cpu_may_complete_boot_mask)) + goto die; + + notify_cpu_starting(cpuid); + return; + +die: +#ifdef CONFIG_HOTPLUG_CPU + /* was set by smp_store_cpu_info->...->numa_add_cpu */ + numa_remove_cpu(cpuid); + remove_siblinginfo(cpuid); + clear_local_APIC(); + /* was set by cpu_init() */ + cpumask_clear_cpu(cpuid, cpu_initialized_mask); + cpumask_clear_cpu(cpuid, cpu_callin_mask); + play_dead(); +#endif + panic("%s: Failed to online CPU%d!\n", __func__, cpuid); } static int cpu0_logical_apicid; @@ -827,6 +855,8 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle) } if (cpumask_test_cpu(cpu, cpu_callin_mask)) { + /* Signal AP that it may continue to boot */ + cpumask_set_cpu(cpu, cpu_may_complete_boot_mask); print_cpu_msr(&cpu_data(cpu)); pr_debug("CPU%d: has booted.\n", cpu); } else { @@ -1085,6 +1115,7 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus) */ smp_store_boot_cpu_info(); /* Final full version of the data */ cpumask_copy(cpu_callin_mask, cpumask_of(0)); + cpumask_copy(cpu_may_complete_boot_mask, cpumask_of(0)); mb(); current_thread_info()->cpu = 0; /* needed? */ @@ -1294,6 +1325,8 @@ static void __ref remove_cpu_from_maps(int cpu) cpumask_clear_cpu(cpu, cpu_callin_mask); /* was set by cpu_init() */ cpumask_clear_cpu(cpu, cpu_initialized_mask); + /* set by do_boot_cpu() */ + cpumask_clear_cpu(cpu, cpu_may_complete_boot_mask); numa_remove_cpu(cpu); } -- 1.7.1 -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@...r.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists