linux-kernel - Re: [PATCH] riscv: lib: optimize strlen loop efficiency


Message-ID: <581a8707-cb16-46d9-b6b5-8fa267383318@kylinos.cn>
Date: Thu, 15 Jan 2026 11:23:13 +0800
From: Feng Jiang <jiangfeng@...inos.cn>
To: Paul Walmsley <pjw@...nel.org>
Cc: palmer@...belt.com, aou@...s.berkeley.edu, alex@...ti.fr,
 samuel.holland@...ive.com, charlie@...osinc.com, conor.dooley@...rochip.com,
 linux-riscv@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] riscv: lib: optimize strlen loop efficiency

On 2026/1/15 10:03, Paul Walmsley wrote:
> On Thu, 18 Dec 2025, Feng Jiang wrote:
> 
>> Optimize the generic strlen implementation by using a pre-decrement
>> pointer. This reduces the loop body from 4 instructions to 3 and
>> eliminates the unconditional jump ('j').
>>
>> Old loop (4 instructions, 2 branches):
>>   1: lbu t0, 0(t1); beqz t0, 2f; addi t1, t1, 1; j 1b
>>
>> New loop (3 instructions, 1 branch):
>>   1: addi t1, t1, 1; lbu t0, 0(t1); bnez t0, 1b
>>
>> This change improves execution efficiency and reduces branch pressure
>> for systems without the Zbb extension.
> 
> Looks reasonable; do you have any benchmarks on hardware that you can 
> share?  Any reason why this patch stands alone and isn't rolled up as part 
> of your "optimize string function" series?

Thanks for the feedback.

This patch predates the rest of the series, which is why it wasn't included
in the 'optimize string function' rollup. At the time, I focused on correctness
testing and observed the improvement through cycle counts read with the
rdcycle instruction.

Since the series still needs further refinement and may take longer to
complete, I was hoping this standalone optimization could be considered
independently. However, I am also happy to roll it into the series if you prefer.
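
For reference, the two loop shapes quoted above can be modeled in C roughly
as follows. This is only an illustrative sketch, not the patch itself: the
names strlen_old, strlen_new and check_strlen_variants are hypothetical, and
an unsigned index stands in for the t1 pointer so that starting one position
before the string stays well defined in C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Old shape: test at the top, increment, then jump back unconditionally.
 * Per iteration: lbu; beqz; addi; j  (4 instructions, 2 branches). */
static size_t strlen_old(const char *s)
{
	size_t i = 0;

	while (s[i] != '\0')
		i++;
	return i;
}

/* New shape: start one position "before" the string and increment first,
 * so a single conditional branch closes the loop.
 * Per iteration: addi; lbu; bnez  (3 instructions, 1 branch).
 * (size_t)-1 wraps to 0 on the first increment; unsigned wraparound is
 * well defined, unlike forming the pointer s - 1 directly. */
static size_t strlen_new(const char *s)
{
	size_t i = (size_t)-1;

	do {
		i++;
	} while (s[i] != '\0');
	return i;
}

/* Hypothetical correctness check: both shapes must agree with strlen(). */
static void check_strlen_variants(void)
{
	static const char *const samples[] = { "", "a", "riscv", "no Zbb here" };
	size_t n;

	for (n = 0; n < sizeof(samples) / sizeof(samples[0]); n++) {
		assert(strlen_old(samples[n]) == strlen(samples[n]));
		assert(strlen_new(samples[n]) == strlen(samples[n]));
	}
}
```

Calling check_strlen_variants() exercises the empty string as well, which is
the case where the increment-first loop is easiest to get wrong. A compiler
may of course emit different instructions for this C than the hand-written
assembly, so it only models the control flow, not the cycle counts.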

-- 
With Best Regards,
Feng Jiang