linux-kernel - Re: [PATCH v2] RISC-V: Increase range and default value of NR

Message-ID: <b907f6c8-968b-7e1e-bc83-1d54c7e0b448@canonical.com>
Date:   Fri, 8 Apr 2022 18:38:28 +0200
From:   Heinrich Schuchardt <heinrich.schuchardt@...onical.com>
To:     Anup Patel <apatel@...tanamicro.com>
Cc:     Palmer Dabbelt <palmer@...belt.com>,
        Paul Walmsley <paul.walmsley@...ive.com>,
        Arnd Bergmann <arnd@...db.de>,
        Atish Patra <atishp@...shpatra.org>,
        Alistair Francis <Alistair.Francis@....com>,
        Anup Patel <anup@...infault.org>,
        linux-riscv <linux-riscv@...ts.infradead.org>,
        "linux-kernel@...r.kernel.org List" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] RISC-V: Increase range and default value of NR_CPUS

On 4/6/22 12:10, Anup Patel wrote:
> On Wed, Apr 6, 2022 at 3:25 PM Heinrich Schuchardt
> <heinrich.schuchardt@...onical.com> wrote:
>>
>> On 3/31/22 21:42, Palmer Dabbelt wrote:
>>> On Sat, 19 Mar 2022 05:12:06 PDT (-0700), apatel@...tanamicro.com wrote:
>>>> Currently, the range and default value of NR_CPUS is too restrictive
>>>> for high-end RISC-V systems with a large number of HARTs. The latest
>>>> QEMU virt machine supports up to 512 CPUs so the current NR_CPUS is
>>>> restrictive for QEMU as well. Other major architectures (such as
>>>> ARM64, x86_64, MIPS, etc) have a much higher range and default
>>>> value of NR_CPUS.
>>>>
>>>> This patch increases NR_CPUS range to 2-512 and default value to
>>>> XLEN (i.e. 32 for RV32 and 64 for RV64).
>>>>
>>>> Signed-off-by: Anup Patel <apatel@...tanamicro.com>
>>>> ---
>>>> Changes since v1:
>>>>   - Updated NR_CPUS range to 2-512 which reflects maximum number of
>>>>     CPUs supported by QEMU virt machine.
>>>> ---
>>>>   arch/riscv/Kconfig | 7 ++++---
>>>>   1 file changed, 4 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
>>>> index 5adcbd9b5e88..423ac17f598c 100644
>>>> --- a/arch/riscv/Kconfig
>>>> +++ b/arch/riscv/Kconfig
>>>> @@ -274,10 +274,11 @@ config SMP
>>>>         If you don't know what to do here, say N.
>>>>
>>>>   config NR_CPUS
>>>> -    int "Maximum number of CPUs (2-32)"
>>>> -    range 2 32
>>>> +    int "Maximum number of CPUs (2-512)"
>>>> +    range 2 512
>>
>> For SBI_V01=y there seems to be a hard constraint to XLEN bits.
>> See __sbi_v01_cpumask_to_hartmask() in arch/riscv/kernel/sbi.c.
>>
>> So shouldn't this be something like:
>>
>> range 2 512 if !SBI_V01
>> range 2 32 if SBI_V01 && 32BIT
>> range 2 64 if SBI_V01 && 64BIT
> 
> This is just making it unnecessarily complicated for supporting
> SBI v0.1
> 
> How about removing SBI v0.1 support and the spin-wait CPU
> operations from arch/riscv ?

The SBI v0.1 specification was only a draft. Only the v1.0 version has 
ever been ratified.

It would be good to remove this legacy code from Linux and U-Boot.
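
To illustrate the XLEN constraint mentioned above for SBI v0.1, here is a
minimal sketch of packing hart IDs into a single machine word. This is not
the kernel's __sbi_v01_cpumask_to_hartmask(); the names HARTMASK_BITS and
hartids_to_single_word_mask() are hypothetical, and unsigned long is assumed
to stand in for an XLEN-bit register.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical name: width of the one-word mask, standing in for XLEN. */
#define HARTMASK_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Pack a list of hart IDs into one unsigned long bitmask.
 * Returns false as soon as a hart ID does not fit into the word,
 * which is the case a single-word hart mask cannot represent.
 * On failure *mask is left untouched.
 */
static bool hartids_to_single_word_mask(const unsigned long *hartids,
                                        size_t n, unsigned long *mask)
{
    unsigned long m = 0;

    assert(hartids != NULL && mask != NULL);

    for (size_t i = 0; i < n; i++) {
        if (hartids[i] >= HARTMASK_BITS)
            return false;           /* hart ID beyond the mask width */
        m |= 1UL << hartids[i];
    }

    *mask = m;
    return true;
}
```

With a one-word mask, any hart ID at or above the word width cannot be
addressed, which is why a larger NR_CPUS range would need to exclude the
v0.1 interface or be capped at 32 or 64.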

By the way, why does upstream OpenSBI claim to be conformant to SBI v0.3 
and not to v1.0?

include/sbi/sbi_ecall.h:16:

#define SBI_ECALL_VERSION_MAJOR 0
#define SBI_ECALL_VERSION_MINOR 3
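
For reference, a sketch of how a major/minor pair like the one above could be
packed into the single value an SBI implementation reports as its
specification version. It assumes the layout from my reading of the SBI
specification (minor number in the low 24 bits, major number in the next
7 bits); the macro and function names are hypothetical and not taken from
OpenSBI.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bits 23:0 minor, bits 30:24 major, bit 31 zero. */
#define SPEC_VERSION_MAJOR_SHIFT 24
#define SPEC_VERSION_MAJOR_MASK  0x7fU
#define SPEC_VERSION_MINOR_MASK  0xffffffU

/* Combine major and minor into one 32-bit version word. */
static uint32_t spec_version_encode(uint32_t major, uint32_t minor)
{
    assert(major <= SPEC_VERSION_MAJOR_MASK);
    assert(minor <= SPEC_VERSION_MINOR_MASK);
    return (major << SPEC_VERSION_MAJOR_SHIFT) | minor;
}

/* Extract the major number from a version word. */
static uint32_t spec_version_major(uint32_t version)
{
    return (version >> SPEC_VERSION_MAJOR_SHIFT) & SPEC_VERSION_MAJOR_MASK;
}

/* Extract the minor number from a version word. */
static uint32_t spec_version_minor(uint32_t version)
{
    return version & SPEC_VERSION_MINOR_MASK;
}
```

Under that assumed layout, major 0 and minor 3 encode to 0x3, while a v1.0
implementation would report 0x01000000.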

Best regards

Heinrich

> 
>>
>>>>       depends on SMP
>>>> -    default "8"
>>>> +    default "32" if 32BIT
>>>> +    default "64" if 64BIT
>>>>
>>>>   config HOTPLUG_CPU
>>>>       bool "Support for hot-pluggable CPUs"
>>>
>>> I'm getting all sorts of boot issues with more than 32 CPUs, even on the
>>> latest QEMU master.  I'm not opposed to increasing the CPU count in
>>> theory, but if we're going to have a setting that goes up to a huge
>>> number it needs to at least boot.  I've got 64 host threads, so it
>>> shouldn't just be a scheduling thing.
>>
>> Currently high performing hardware for RISC-V is missing. So it makes
>> sense to build software via QEMU on x86_64 or arm64 with as many
>> hardware threads as available (128 is not uncommon).
>>
>> OpenSBI currently is limited to 128 threads:
>> include/sbi/sbi_hartmask.h:22:
>> #define SBI_HARTMASK_MAX_BITS 128
>> This is just an arbitrary value that can be modified.
> 
> Yes, this limit will be gradually increased with some improvements
> to optimize runtime memory used by OpenSBI.
> 
>>
>> U-Boot v2022.04 qemu-riscv64_smode_defconfig has a value of
>> CONFIG_SYS_MALLOC_F_LEN that is too low. This leads to a boot failure for
>> more than 16 harts. A patch to correct this is pending:
>> [PATCH v2 1/1] riscv: alloc space exhausted
>> https://lore.kernel.org/u-boot/CAN5B=eKt=tFLZ2z3aNHJqsnJzpdA0oikcrC2i1_=ZDD=f+M0jA@mail.gmail.com/T/#t
>>
>> With QEMU 7.0 and the U-Boot fix booting into a 5.17 defconfig kernel
>> with 64 virtual cores worked fine for me.
> 
> Thanks for trying this patch.
> 
> Regards,
> Anup
> 
>>
>> Best regards
>>
>> Heinrich
>>
>>>
>>> If there was some hardware that actually boots on these I'd be happy to
>>> take it, but given that it's just QEMU I'd prefer to sort out the bugs
>>> first.  It's probably just latent bugs somewhere, but allowing users to
>>> turn on configs we know don't work just seems like the wrong way to go.
>>>