Message-ID: <f1aa2a5a-95c9-dc41-bfc9-0be12a5781c6@wdc.com>
Date: Thu, 6 Sep 2018 11:25:20 -0700
From: Atish Patra <atish.patra@....com>
To: Mark Rutland <mark.rutland@....com>
Cc: "palmer@...ive.com" <palmer@...ive.com>,
"linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
"hch@...radead.org" <hch@...radead.org>,
"anup@...infault.org" <anup@...infault.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Damien Le Moal <Damien.LeMoal@....com>,
"marc.zyngier@....com" <marc.zyngier@....com>,
"jeremy.linton@....com" <jeremy.linton@....com>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
"jason@...edaemon.net" <jason@...edaemon.net>,
"catalin.marinas@....com" <catalin.marinas@....com>,
"dmitriy@...-tech.org" <dmitriy@...-tech.org>,
"ard.biesheuvel@...aro.org" <ard.biesheuvel@...aro.org>
Subject: Re: [PATCH v3 12/12] RISC-V: Support cpu hotplug.
On 9/6/18 3:21 AM, Mark Rutland wrote:
> Hi,
>
> On Thu, Sep 06, 2018 at 01:05:35AM -0700, Atish Patra wrote:
>> This patch enables support for cpu hotplug in RISC-V.
>>
>> In the absence of generic cpu stop functions, WFI is used
>> to put the cpu in a low power state while offline. An IPI
>> is sent to bring it out of WFI during the online operation.
>
> AFAICT, this doesn't actually return the CPU to firmware, and the WFI and
> return code are in-kernel. From experience on arm and arm64 I would *very*
> strongly advise against this kind of pseudo-hotplug, as it causes many more
> long-term problems than it solves.
>
I completely agree with you. The idea here was to have a working
cpu-hotplug solution until we have a concrete implementation of cpu
stop/start via firmware. Once we have that, we just need to replace the
wakeup and wait_for_software_interrupt calls. The current implementation
via WFI was never meant to be a long-term solution.
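
A rough userspace sketch of what "replace the two calls" could look like,
for illustration only. Nothing here is in the patch: cpu_park_ops,
select_park_ops and the state array are hypothetical names, and the
firmware backend only records a state change where a real one would issue
a firmware call that does not exist yet.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical bookkeeping that stands in for real hart state. */
enum park_state { RUNNING, PARKED_WFI, STOPPED_FW };
static enum park_state state[NR_CPUS];

/* Hypothetical: one backend per way of taking a hart offline. */
struct cpu_park_ops {
	const char *name;
	void (*park_self)(unsigned int cpu);	/* run by the CPU going offline */
	void (*wake)(unsigned int cpu);		/* run by the CPU doing the online */
};

/* Stand-in for the current scheme: WFI loop in the kernel, IPI to wake. */
static void wfi_park_self(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	state[cpu] = PARKED_WFI;
}

static void wfi_wake(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	state[cpu] = RUNNING;
}

/* Stand-in for a future firmware stop/start interface (assumed). */
static void fw_park_self(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	state[cpu] = STOPPED_FW;
}

static void fw_wake(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	state[cpu] = RUNNING;
}

static const struct cpu_park_ops wfi_ops = {
	.name = "wfi", .park_self = wfi_park_self, .wake = wfi_wake,
};

static const struct cpu_park_ops fw_ops = {
	.name = "firmware", .park_self = fw_park_self, .wake = fw_wake,
};

/* Pick the firmware backend when it is available, else fall back to WFI. */
static const struct cpu_park_ops *select_park_ops(bool fw_has_cpu_stop)
{
	return fw_has_cpu_stop ? &fw_ops : &wfi_ops;
}
```

With an indirection like this, cpu_play_dead() and __cpu_up() would call
ops->park_self() and ops->wake() and would not need to change when the
firmware interface arrives.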
> For instance, this causes significant pain with kexec, since the prior kernel's
> text must be kept around forever for the offline CPUs. Likewise with hibernate
> suspend/resume.
>
> Depending on how your architecture works, there can also be a number of
> problems with cache/TLB maintenance for secondary CPUs which are unexpectedly
> online.
>
> I would suggest that someone should look at writing a FW standard for CPU
> hotplug, akin to PSCI for arm/arm64. At minimum, you will want:
>
I have already started working on it in parallel. As of now, I am
planning to strengthen the SBI spec by making it more flexible/extensible
with some of the features you mention below. But we can come up with a
new spec altogether if that makes more sense.
> - A mechanism to determine the version of FW and/or features supported by the FW.
>
> - a mechanism to hotplug-in a specific CPU to a runtime-specified address,
> ideally with some unique token (e.g. so you can pass a pointer to its stack
> or whatever).
>
> - a mechanism to hotplug-out the calling CPU.
>
> - a mechanism to determine when a specific CPU has been hotplugged out. This is
> necessary for kexec and other things to be robust.
>
Thanks for the pointers. I had not looked into items 2 and 4 earlier. I will go
over the PSCI docs again.
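
To make the four mechanisms concrete, here is a minimal userspace mock of
such an interface. It is a sketch under assumptions, not a proposal for
the actual spec: every name (fw_get_version, fw_cpu_on, fw_cpu_off,
fw_cpu_status), the error codes and the version encoding are made up, and
the calls are only loosely modelled on PSCI's PSCI_VERSION, CPU_ON,
CPU_OFF and AFFINITY_INFO.

```c
#include <assert.h>
#include <stdint.h>

#define FW_NR_HARTS 4

enum fw_hart_state { FW_HART_OFF = 0, FW_HART_ON = 1 };
enum fw_error { FW_OK = 0, FW_ERR_INVALID = -1, FW_ERR_ALREADY_ON = -2 };

struct fw_hart {
	enum fw_hart_state state;
	uintptr_t entry;	/* address the hart starts executing at */
	uintptr_t token;	/* opaque value handed to the entry point */
};

/* Mock firmware state: only the boot hart (hart 0) starts out on. */
static struct fw_hart fw_harts[FW_NR_HARTS] = { [0] = { .state = FW_HART_ON } };

/* 1. Version/feature discovery: major in the upper 16 bits, minor below. */
static uint32_t fw_get_version(void)
{
	return (0u << 16) | 1u;		/* hypothetical version 0.1 */
}

/* 2. Hotplug-in a specific hart at a runtime address, with a token. */
static int fw_cpu_on(unsigned int hartid, uintptr_t entry, uintptr_t token)
{
	if (hartid >= FW_NR_HARTS)
		return FW_ERR_INVALID;
	if (fw_harts[hartid].state == FW_HART_ON)
		return FW_ERR_ALREADY_ON;
	fw_harts[hartid].entry = entry;
	fw_harts[hartid].token = token;
	fw_harts[hartid].state = FW_HART_ON;
	return FW_OK;
}

/*
 * 3. Hotplug-out the calling hart. A real call would not return; the mock
 * takes the caller's hartid explicitly and only records the state change.
 */
static void fw_cpu_off(unsigned int self_hartid)
{
	assert(self_hartid < FW_NR_HARTS);
	fw_harts[self_hartid].state = FW_HART_OFF;
}

/* 4. Query whether a hart has really been hotplugged out. */
static int fw_cpu_status(unsigned int hartid)
{
	if (hartid >= FW_NR_HARTS)
		return FW_ERR_INVALID;
	return fw_harts[hartid].state;
}
```

In this shape, __cpu_die() could poll fw_cpu_status() for the dying hart
rather than trusting cpu_wait_death() alone, which is the robustness
point raised for kexec.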
Regards,
Atish
> Thanks,
> Mark.
>
>>
>> Tested both on QEMU and HighFive Unleashed board with
>> 4 cpus. Test result follows.
>>
>> $ echo 0 > /sys/devices/system/cpu/cpu2/online
>> [ 31.828562] CPU2: shutdown
>> $ cat /proc/cpuinfo
>> hart : 0
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> hart : 1
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> hart : 3
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> $ echo 0 > /sys/devices/system/cpu/cpu3/online
>> [ 52.968495] CPU3: shutdown
>> $ cat /proc/cpuinfo
>> hart : 0
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> hart : 2
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> $ echo 1 > /sys/devices/system/cpu/cpu3/online
>> [ 64.298250] CPU3: online
>> $ cat /proc/cpuinfo
>> hart : 0
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> hart : 1
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> hart : 3
>> isa : rv64imafdc
>> mmu : sv39
>> uarch : sifive,rocket0
>>
>> Signed-off-by: Atish Patra <atish.patra@....com>
>> ---
>> arch/riscv/Kconfig | 12 ++++++-
>> arch/riscv/include/asm/irq.h | 1 +
>> arch/riscv/include/asm/smp.h | 28 ++++++++++++++++
>> arch/riscv/kernel/Makefile | 1 +
>> arch/riscv/kernel/cpu-hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++
>> arch/riscv/kernel/head.S | 13 ++++++++
>> arch/riscv/kernel/irq.c | 24 ++++++++++++++
>> arch/riscv/kernel/setup.c | 17 +++++++++-
>> arch/riscv/kernel/smp.c | 8 +----
>> arch/riscv/kernel/smpboot.c | 7 ++--
>> arch/riscv/kernel/traps.c | 6 ++--
>> 11 files changed, 175 insertions(+), 14 deletions(-)
>> create mode 100644 arch/riscv/kernel/cpu-hotplug.c
>>
>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>> index a3449802..ea2e617e 100644
>> --- a/arch/riscv/Kconfig
>> +++ b/arch/riscv/Kconfig
>> @@ -21,7 +21,6 @@ config RISCV
>> select COMMON_CLK
>> select DMA_DIRECT_OPS
>> select GENERIC_CLOCKEVENTS
>> - select GENERIC_CPU_DEVICES
>> select GENERIC_IRQ_SHOW
>> select GENERIC_PCI_IOMAP
>> select GENERIC_STRNCPY_FROM_USER
>> @@ -167,6 +166,17 @@ config SMP
>>
>> If you don't know what to do here, say N.
>>
>> +config HOTPLUG_CPU
>> + bool "Support for hot-pluggable CPUs"
>> + depends on SMP
>> + select GENERIC_IRQ_MIGRATION
>> + help
>> +
>> + Say Y here to experiment with turning CPUs off and on. CPUs
>> + can be controlled through /sys/devices/system/cpu.
>> +
>> + Say N if you want to disable CPU hotplug.
>> +
>> config NR_CPUS
>> int "Maximum number of CPUs (2-32)"
>> range 2 32
>> diff --git a/arch/riscv/include/asm/irq.h b/arch/riscv/include/asm/irq.h
>> index 996b6fbe..a873a72d 100644
>> --- a/arch/riscv/include/asm/irq.h
>> +++ b/arch/riscv/include/asm/irq.h
>> @@ -19,6 +19,7 @@
>>
>> void riscv_timer_interrupt(void);
>> void riscv_software_interrupt(void);
>> +void wait_for_software_interrupt(void);
>>
>> #include <asm-generic/irq.h>
>>
>> diff --git a/arch/riscv/include/asm/smp.h b/arch/riscv/include/asm/smp.h
>> index fce312ce..8145b865 100644
>> --- a/arch/riscv/include/asm/smp.h
>> +++ b/arch/riscv/include/asm/smp.h
>> @@ -25,8 +25,29 @@
>> extern unsigned long __cpuid_to_hardid_map[NR_CPUS];
>> #define cpuid_to_hardid_map(cpu) __cpuid_to_hardid_map[cpu]
>>
>> +#if defined CONFIG_SMP && defined CONFIG_HOTPLUG_CPU
>> +void arch_send_call_wakeup_ipi(int cpu);
>> +bool can_hotplug_cpu(void);
>> +#else
>> +static inline bool can_hotplug_cpu(void)
>> +{
>> + return 0;
>> +}
>> +static inline void arch_send_call_wakeup_ipi(int cpu) { }
>> +#endif
>> +
>> #ifdef CONFIG_SMP
>>
>> +enum ipi_message_type {
>> + IPI_RESCHEDULE,
>> + IPI_CALL_FUNC,
>> + IPI_CALL_WAKEUP,
>> + IPI_MAX
>> +};
>> +
>> +void send_ipi_message(const struct cpumask *to_whom,
>> + enum ipi_message_type operation);
>> +
>> /* SMP initialization hook for setup_arch */
>> void __init setup_smp(void);
>>
>> @@ -45,6 +66,13 @@ void riscv_cpuid_to_hartid_mask(const struct cpumask *in, struct cpumask *out);
>> */
>> #define raw_smp_processor_id() (current_thread_info()->cpu)
>>
>> +#ifdef CONFIG_HOTPLUG_CPU
>> +int __cpu_disable(void);
>> +void __cpu_die(unsigned int cpu);
>> +void cpu_play_dead(void);
>> +void boot_sec_cpu(void);
>> +#endif /* CONFIG_HOTPLUG_CPU */
>> +
>> #else
>>
>> static inline int riscv_hartid_to_cpuid(int hartid)
>> diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
>> index e1274fc0..62043204 100644
>> --- a/arch/riscv/kernel/Makefile
>> +++ b/arch/riscv/kernel/Makefile
>> @@ -35,6 +35,7 @@ obj-$(CONFIG_SMP) += smpboot.o
>> obj-$(CONFIG_SMP) += smp.o
>> obj-$(CONFIG_MODULES) += module.o
>> obj-$(CONFIG_MODULE_SECTIONS) += module-sections.o
>> +obj-$(CONFIG_HOTPLUG_CPU) += cpu-hotplug.o
>>
>> obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o
>> obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o
>> diff --git a/arch/riscv/kernel/cpu-hotplug.c b/arch/riscv/kernel/cpu-hotplug.c
>> new file mode 100644
>> index 00000000..7a152972
>> --- /dev/null
>> +++ b/arch/riscv/kernel/cpu-hotplug.c
>> @@ -0,0 +1,72 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * Copyright (C) 2018 Western Digital Corporation or its affiliates.
>> + */
>> +
>> +#include <linux/kernel.h>
>> +#include <linux/mm.h>
>> +#include <linux/sched.h>
>> +#include <linux/err.h>
>> +#include <linux/irq.h>
>> +#include <linux/cpu.h>
>> +#include <linux/sched/hotplug.h>
>> +#include <asm/irq.h>
>> +#include <asm/sbi.h>
>> +
>> +bool can_hotplug_cpu(void)
>> +{
>> + return true;
>> +}
>> +
>> +void arch_cpu_idle_dead(void)
>> +{
>> + cpu_play_dead();
>> +}
>> +
>> +/*
>> + * __cpu_disable runs on the processor to be shutdown.
>> + */
>> +int __cpu_disable(void)
>> +{
>> + int ret = 0;
>> + unsigned int cpu = smp_processor_id();
>> +
>> + set_cpu_online(cpu, false);
>> + irq_migrate_all_off_this_cpu();
>> +
>> + return ret;
>> +}
>> +/*
>> + * called on the thread which is asking for a CPU to be shutdown -
>> + * waits until shutdown has completed, or it is timed out.
>> + */
>> +void __cpu_die(unsigned int cpu)
>> +{
>> + if (!cpu_wait_death(cpu, 5)) {
>> + pr_err("CPU %u: didn't die\n", cpu);
>> + return;
>> + }
>> + pr_notice("CPU%u: shutdown\n", cpu);
>> + /* TODO: Do we need to verify if cpu is really dead? */
>> +}
>> +
>> +/*
>> + * Called from the idle thread for the CPU which has been shutdown.
>> + *
>> + */
>> +void cpu_play_dead(void)
>> +{
>> + idle_task_exit();
>> +
>> + (void)cpu_report_death();
>> +
>> + /* Do not disable software interrupt to restart cpu after WFI */
>> + csr_clear(sie, SIE_STIE | SIE_SEIE);
>> + wait_for_software_interrupt();
>> + boot_sec_cpu();
>> +}
>> +
>> +void arch_send_call_wakeup_ipi(int cpu)
>> +{
>> + send_ipi_message(cpumask_of(cpu), IPI_CALL_WAKEUP);
>> +}
>> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
>> index 711190d4..07d9fb38 100644
>> --- a/arch/riscv/kernel/head.S
>> +++ b/arch/riscv/kernel/head.S
>> @@ -153,6 +153,19 @@ relocate:
>> j .Lsecondary_park
>> END(_start)
>>
>> +#ifdef CONFIG_SMP
>> +.section .text
>> +.global boot_sec_cpu
>> +
>> +boot_sec_cpu:
>> + /* clear all pending flags */
>> + csrw sip, zero
>> + /* Mask all interrupts */
>> + csrw sie, zero
>> + fence
>> +
>> + tail smp_callin
>> +#endif
>> __PAGE_ALIGNED_BSS
>> /* Empty zero page */
>> .balign PAGE_SIZE
>> diff --git a/arch/riscv/kernel/irq.c b/arch/riscv/kernel/irq.c
>> index 0cfac48a..7e14b0d9 100644
>> --- a/arch/riscv/kernel/irq.c
>> +++ b/arch/riscv/kernel/irq.c
>> @@ -53,6 +53,30 @@ asmlinkage void __irq_entry do_IRQ(struct pt_regs *regs, unsigned long cause)
>> set_irq_regs(old_regs);
>> }
>>
>> +/*
>> + * This function doesn't return until a software interrupt is sent via IPI.
>> + * Obviously, all the interrupts except software interrupt should be disabled
>> + * before this function is called.
>> + */
>> +void wait_for_software_interrupt(void)
>> +{
>> + unsigned long sipval, sieval, scauseval;
>> +
>> + /* clear all pending flags */
>> + csr_write(sip, 0);
>> + /* clear any previous scause data */
>> + csr_write(scause, 0);
>> +
>> + do {
>> + wait_for_interrupt();
>> + sipval = csr_read(sip);
>> + sieval = csr_read(sie);
>> + scauseval = csr_read(scause) & ~INTERRUPT_CAUSE_FLAG;
>> + /* only break if wfi returns for an enabled interrupt */
>> + } while ((sipval & sieval) == 0 &&
>> + scauseval != INTERRUPT_CAUSE_SOFTWARE);
>> +}
>> +
>> void __init init_IRQ(void)
>> {
>> irqchip_init();
>> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
>> index a5fac1b7..612f0c21 100644
>> --- a/arch/riscv/kernel/setup.c
>> +++ b/arch/riscv/kernel/setup.c
>> @@ -30,11 +30,11 @@
>> #include <linux/of_platform.h>
>> #include <linux/sched/task.h>
>> #include <linux/swiotlb.h>
>> +#include <linux/smp.h>
>>
>> #include <asm/setup.h>
>> #include <asm/sections.h>
>> #include <asm/pgtable.h>
>> -#include <asm/smp.h>
>> #include <asm/sbi.h>
>> #include <asm/tlbflush.h>
>> #include <asm/thread_info.h>
>> @@ -82,6 +82,7 @@ EXPORT_SYMBOL(empty_zero_page);
>> /* The lucky hart to first increment this variable will boot the other cores */
>> atomic_t hart_lottery;
>> unsigned long boot_cpu_hartid;
>> +static DEFINE_PER_CPU(struct cpu, cpu_devices);
>>
>> unsigned long __cpuid_to_hardid_map[NR_CPUS] = {
>> [0 ... NR_CPUS-1] = INVALID_HARTID
>> @@ -257,3 +258,17 @@ void __init setup_arch(char **cmdline_p)
>> riscv_fill_hwcap();
>> }
>>
>> +static int __init topology_init(void)
>> +{
>> + int i;
>> +
>> + for_each_possible_cpu(i) {
>> + struct cpu *cpu = &per_cpu(cpu_devices, i);
>> +
>> + cpu->hotpluggable = can_hotplug_cpu();
>> + register_cpu(cpu, i);
>> + }
>> +
>> + return 0;
>> +}
>> +subsys_initcall(topology_init);
>> diff --git a/arch/riscv/kernel/smp.c b/arch/riscv/kernel/smp.c
>> index 89d95866..629456bb 100644
>> --- a/arch/riscv/kernel/smp.c
>> +++ b/arch/riscv/kernel/smp.c
>> @@ -32,12 +32,6 @@ static struct {
>> unsigned long bits ____cacheline_aligned;
>> } ipi_data[NR_CPUS] __cacheline_aligned;
>>
>> -enum ipi_message_type {
>> - IPI_RESCHEDULE,
>> - IPI_CALL_FUNC,
>> - IPI_MAX
>> -};
>> -
>> int riscv_hartid_to_cpuid(int hartid)
>> {
>> int i = -1;
>> @@ -94,7 +88,7 @@ void riscv_software_interrupt(void)
>> }
>> }
>>
>> -static void
>> +void
>> send_ipi_message(const struct cpumask *to_whom, enum ipi_message_type operation)
>> {
>> int cpuid, hartid;
>> diff --git a/arch/riscv/kernel/smpboot.c b/arch/riscv/kernel/smpboot.c
>> index f44ae780..a9138431 100644
>> --- a/arch/riscv/kernel/smpboot.c
>> +++ b/arch/riscv/kernel/smpboot.c
>> @@ -92,9 +92,12 @@ int __cpu_up(unsigned int cpu, struct task_struct *tidle)
>> task_stack_page(tidle) + THREAD_SIZE);
>> WRITE_ONCE(__cpu_up_task_pointer[hartid], tidle);
>>
>> + arch_send_call_wakeup_ipi(cpu);
>> while (!cpu_online(cpu))
>> cpu_relax();
>>
>> + pr_notice("CPU%u: online\n", cpu);
>> +
>> return 0;
>> }
>>
>> @@ -105,7 +108,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
>> /*
>> * C entry point for a secondary processor.
>> */
>> -asmlinkage void __init smp_callin(void)
>> +asmlinkage void smp_callin(void)
>> {
>> struct mm_struct *mm = &init_mm;
>>
>> @@ -115,7 +118,7 @@ asmlinkage void __init smp_callin(void)
>>
>> trap_init();
>> notify_cpu_starting(smp_processor_id());
>> - set_cpu_online(smp_processor_id(), 1);
>> + set_cpu_online(smp_processor_id(), true);
>> /*
>> * Remote TLB flushes are ignored while the CPU is offline, so emit
>> * a local TLB flush right now just in case.
>> diff --git a/arch/riscv/kernel/traps.c b/arch/riscv/kernel/traps.c
>> index 24a9333d..8b331619 100644
>> --- a/arch/riscv/kernel/traps.c
>> +++ b/arch/riscv/kernel/traps.c
>> @@ -153,7 +153,7 @@ int is_valid_bugaddr(unsigned long pc)
>> }
>> #endif /* CONFIG_GENERIC_BUG */
>>
>> -void __init trap_init(void)
>> +void trap_init(void)
>> {
>> /*
>> * Set sup0 scratch register to 0, indicating to exception vector
>> @@ -162,6 +162,6 @@ void __init trap_init(void)
>> csr_write(sscratch, 0);
>> /* Set the exception vector address */
>> csr_write(stvec, &handle_exception);
>> - /* Enable all interrupts */
>> - csr_write(sie, -1);
>> + /* Enable all interrupts but timer interrupt*/
>> + csr_set(sie, SIE_SSIE | SIE_SEIE);
>> }
>> --
>> 2.7.4
>>
>