Message-ID: <61173b04-faea-4dfe-8e82-95a55ee33f3f@ghiti.fr>
Date: Fri, 28 Mar 2025 15:07:36 +0100
From: Alexandre Ghiti <alex@...ti.fr>
To: Kuan-Wei Chiu <visitorckw@...il.com>, paul.walmsley@...ive.com,
palmer@...belt.com, aou@...s.berkeley.edu
Cc: jserv@...s.ncku.edu.tw, eleanor15x@...il.com,
linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH] riscv: Optimize gcd() performance by selecting
CPU_NO_EFFICIENT_FFS
Hi Kuan-Wei,
First, sorry for the late review.
On 17/02/2025 02:37, Kuan-Wei Chiu wrote:
> When the Zbb extension is not supported, ffs() falls back to a software
> implementation instead of leveraging the hardware ctz instruction for
> fast computation. In such cases, selecting CPU_NO_EFFICIENT_FFS
> optimizes the efficiency of gcd().
>
> The implementation of gcd() depends on the CPU_NO_EFFICIENT_FFS option.
> With hardware support for ffs, the binary GCD algorithm is used.
> Without it, the odd-even GCD algorithm is employed for better
> performance.
>
> Co-developed-by: Yu-Chun Lin <eleanor15x@...il.com>
> Signed-off-by: Yu-Chun Lin <eleanor15x@...il.com>
> Signed-off-by: Kuan-Wei Chiu <visitorckw@...il.com>
> ---
> Although selecting NO_EFFICIENT_FFS seems reasonable without ctz
> instructions, this patch hasn't been tested on real hardware. We'd
> greatly appreciate it if someone could help test and provide
> performance numbers!
>
> arch/riscv/Kconfig | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 7612c52e9b1e..2dd3699ad09b 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -91,6 +91,7 @@ config RISCV
>  	select CLINT_TIMER if RISCV_M_MODE
>  	select CLONE_BACKWARDS
>  	select COMMON_CLK
> +	select CPU_NO_EFFICIENT_FFS if !RISCV_ISA_ZBB
>  	select CPU_PM if CPU_IDLE || HIBERNATION || SUSPEND
>  	select EDAC_SUPPORT
>  	select FRAME_POINTER if PERF_EVENTS || (FUNCTION_TRACER && !DYNAMIC_FTRACE)
So your patch is correct. But a kernel built with RISCV_ISA_ZBB does not
guarantee that the platform it runs on supports Zbb, and in that case we'd
still use the slow version of gcd().
So I would use static keys instead, so that the gcd() variant is picked at
runtime. Can you try to come up with a patch that does that?
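To illustrate the idea, here is a rough userspace sketch, not kernel code
and not a proposed patch. The two variants follow the binary (ctz-based)
and odd-even algorithms described in the commit message. The names
have_efficient_ffs and gcd_dispatch() are hypothetical, and the plain
boolean only stands in for a static key that would be enabled once at
boot when Zbb is detected:

```c
#include <stdbool.h>

/* Assumption: stand-in for a static key, set once after Zbb detection. */
static bool have_efficient_ffs;

/* Binary GCD: relies on a fast count-trailing-zeros (ctz) operation. */
static unsigned long gcd_ctz(unsigned long a, unsigned long b)
{
	unsigned long r = a | b, t;

	if (!a || !b)
		return r;

	b >>= __builtin_ctzl(b);	/* make b odd */
	if (b == 1)
		return r & -r;		/* largest common power of two */

	for (;;) {
		a >>= __builtin_ctzl(a);	/* make a odd */
		if (a == 1)
			return r & -r;
		if (a == b)
			return a << __builtin_ctzl(r);
		if (a < b) {
			t = a; a = b; b = t;
		}
		a -= b;
	}
}

/* Odd-even GCD: avoids ctz entirely, using single-bit shifts and tests. */
static unsigned long gcd_no_ctz(unsigned long a, unsigned long b)
{
	unsigned long r = a | b, t;

	if (!a || !b)
		return r;

	r &= -r;			/* isolate lowest set bit of a | b */

	while (!(b & r))
		b >>= 1;
	if (b == r)
		return r;

	for (;;) {
		while (!(a & r))
			a >>= 1;
		if (a == r)
			return r;
		if (a == b)
			return a;
		if (a < b) {
			t = a; a = b; b = t;
		}
		a -= b;
		a >>= 1;
		if (a & r)
			a += b;
		a >>= 1;
	}
}

/* Runtime selection; in the kernel this branch would be a static key. */
unsigned long gcd_dispatch(unsigned long a, unsigned long b)
{
	return have_efficient_ffs ? gcd_ctz(a, b) : gcd_no_ctz(a, b);
}
```

Both variants should return the same result for any input, so only the
cost of the call depends on the flag, not its value.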
Thanks,
Alex