Message-Id: <20260123085841.212468-1-jiangfeng@kylinos.cn>
Date: Fri, 23 Jan 2026 16:58:33 +0800
From: Feng Jiang <jiangfeng@...inos.cn>
To: pjw@...nel.org,
	palmer@...belt.com,
	aou@...s.berkeley.edu,
	alex@...ti.fr,
	akpm@...ux-foundation.org,
	kees@...nel.org,
	andy@...nel.org,
	jiangfeng@...inos.cn,
	ebiggers@...nel.org,
	martin.petersen@...cle.com,
	mingo@...nel.org,
	charlie@...osinc.com,
	conor.dooley@...rochip.com,
	samuel.holland@...ive.com,
	linus.walleij@...aro.org,
	nathan@...nel.org
Cc: linux-riscv@...ts.infradead.org,
	linux-kernel@...r.kernel.org,
	linux-hardening@...r.kernel.org
Subject: [PATCH v4 0/8] riscv: optimize string functions and add kunit tests

This series provides optimized implementations of strnlen(), strchr(),
and strrchr() for the RISC-V architecture. The strnlen() implementation
is derived from the existing optimized strlen(). For strchr() and strrchr(),
the versions added here use simple byte-by-byte assembly, which will serve
as a baseline for future Zbb-based optimizations.
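
For readers who want to see the byte-by-byte logic spelled out, here is a
minimal C sketch of the same idea. It is not the assembly from this series;
the names sketch_strchr() and sketch_strrchr() are hypothetical and exist
only for illustration.

```c
#include <stddef.h>

/* Hypothetical names; not the kernel's or the series' symbols. */

/* Return a pointer to the first occurrence of c in s, or NULL. */
char *sketch_strchr(const char *s, int c)
{
	for (;; s++) {
		if (*s == (char)c)
			return (char *)s;	/* also matches the NUL when c == 0 */
		if (*s == '\0')
			return NULL;
	}
}

/* Return a pointer to the last occurrence of c in s, or NULL. */
char *sketch_strrchr(const char *s, int c)
{
	const char *last = NULL;

	/* The NUL terminator is examined too, so c == 0 finds the end. */
	do {
		if (*s == (char)c)
			last = s;
	} while (*s++ != '\0');

	return (char *)last;
}
```

Both loops touch every byte once, so there is little room for a handwritten
version to win on long strings; any gain has to come from fixed overhead.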

The patch series is organized into three parts:
1. Correctness Testing: The first three patches add KUnit test cases
   for strlen(), strnlen(), and strrchr() to ensure the baseline and optimized
   versions are functionally correct.
2. Benchmarking Tool: Patches 4 and 5 extend string_kunit to include
   performance measurement capabilities, allowing for comparative
   analysis within the KUnit environment.
3. Architectural Optimizations: The final three patches introduce the
   RISC-V specific assembly implementations.
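
As a rough illustration of what the correctness tests in part 1 aim for, the
following userspace C sketch compares a strnlen()-style function against a
byte-by-byte reference for limits around the true length. It is not the
KUnit code from the patches; SKETCH_MAX_LEN, check_strnlen(), ref_strnlen()
and unbounded_len() are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_LEN 64	/* hypothetical bound, not STRING_TEST_MAX_LEN */

typedef size_t (*strnlen_fn)(const char *s, size_t maxlen);

/* Byte-by-byte reference used as the expected result. */
static size_t ref_strnlen(const char *s, size_t maxlen)
{
	size_t n = 0;

	while (n < maxlen && s[n] != '\0')
		n++;
	return n;
}

/* Deliberately wrong: ignores maxlen, so the checker should flag it. */
static size_t unbounded_len(const char *s, size_t maxlen)
{
	(void)maxlen;
	return strlen(s);
}

/* Return the number of mismatches between fn and the reference. */
static int check_strnlen(strnlen_fn fn)
{
	char buf[SKETCH_MAX_LEN + 1];
	int failures = 0;

	for (size_t len = 0; len <= SKETCH_MAX_LEN; len++) {
		memset(buf, 'A', sizeof(buf));
		buf[len] = '\0';

		/* Probe limits from len - 2 (clamped at 0) up to len + 2. */
		for (size_t max = (len >= 2 ? len - 2 : 0); max <= len + 2; max++) {
			if (fn(buf, max) != ref_strnlen(buf, max))
				failures++;
		}
	}
	return failures;
}
```

Calling check_strnlen(ref_strnlen) should report no mismatches, while
check_strnlen(unbounded_len) should report some, which shows that limits
below the true length are the cases that catch a missing bound check.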

Following suggestions from Andy Shevchenko, performance benchmarks have
been added to string_kunit.c to provide quantifiable evidence of the
improvements. Andy provided many specific comments on the implementation
of the benchmark logic, which is also inspired by Eric Biggers'
crc_benchmark(). Performance was measured in a QEMU TCG (rv64) environment,
comparing the generic C implementation with the new RISC-V assembly versions.

Performance Summary (Improvement %):
---------------------------------------------------------------
Function  |  16 B (Short) |  512 B (Mid) |  4096 B (Long)
---------------------------------------------------------------
strnlen   |    +72.6%     |   +350.1%    |    +427.5%
strchr    |    +3.6%      |   +3.5%      |    -0.3%
strrchr   |    +5.3%      |   +5.8%      |    +0.8%
---------------------------------------------------------------

The benchmarks can be reproduced by enabling CONFIG_STRING_KUNIT_BENCH
and running:

  ./tools/testing/kunit/kunit.py run --arch=riscv \
      --cross_compile=riscv64-linux-gnu- \
      --kunitconfig=my_string.kunitconfig --raw_output
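
This cover letter does not spell out how the "Improvement %" figures are
derived. One plausible reading, assumed here and not confirmed by the
patches, is the relative throughput gain of the optimized version over the
generic one for the same amount of work. Under that assumption the
arithmetic can be sketched as follows; improvement_pct() is a hypothetical
helper.

```c
/*
 * ASSUMPTION: improvement is the throughput gain relative to the baseline,
 * i.e. (baseline time / optimized time - 1) * 100. This formula is an
 * illustration and is not taken from the patches.
 */
double improvement_pct(double baseline_ns, double optimized_ns)
{
	return (baseline_ns / optimized_ns - 1.0) * 100.0;
}
```

With this definition, halving the run time yields +100%, equal times yield
0%, and a slower optimized version yields a negative value.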

The strnlen() implementation leverages the Zbb 'orc.b' instruction and
word-at-a-time logic, showing significant gains as the string length
increases. For strchr() and strrchr(), the handwritten assembly reduces
fixed overhead by eliminating stack frame management. The gain is most
prominent on short strings where function call overhead dominates,
while the performance converges with the C implementation for longer
strings in the TCG environment.
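
To make the word-at-a-time idea concrete, here is a portable C sketch that
emulates the effect of 'orc.b' in software. It is not the assembly from this
series. The names orc_b() and sketch_strnlen() are hypothetical, the sketch
assumes 64-bit words, and it assumes the caller guarantees that maxlen bytes
starting at s are readable, which is stricter than the real strnlen()
contract. For simplicity it locates the NUL inside the final word with a
byte loop rather than a count-zeros instruction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Software stand-in for Zbb orc.b: nonzero bytes become 0xff, zero bytes stay 0x00. */
static uint64_t orc_b(uint64_t x)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++)
		if ((x >> (8 * i)) & 0xff)
			r |= (uint64_t)0xff << (8 * i);
	return r;
}

/* ASSUMPTION: at least maxlen bytes starting at s are readable. */
size_t sketch_strnlen(const char *s, size_t maxlen)
{
	size_t n = 0;

	/* Word loop: skip 8 bytes at a time while no byte in the word is NUL. */
	while (maxlen - n >= sizeof(uint64_t)) {
		uint64_t w;

		memcpy(&w, s + n, sizeof(w));	/* alignment-safe load */
		if (orc_b(w) != UINT64_MAX)
			break;			/* a NUL is somewhere in this word */
		n += sizeof(w);
	}

	/* Tail, or the word that contains the NUL: finish byte by byte. */
	while (n < maxlen && s[n] != '\0')
		n++;

	return n;
}
```

The word loop performs one load and one comparison per eight bytes, which is
consistent with the gain growing as strings get longer.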

I would like to thank Andy Shevchenko for the suggestion to add benchmarks
and for his detailed feedback on the test framework, and Eric Biggers for
the benchmarking approach. I am also grateful to Qingfang Deng for providing
the optimized implementation logic for strnlen(). Thanks also to Joel Stanley
for testing support and feedback, and to David Laight for his suggestions
regarding performance measurement.

Changes:

v4:
- Refine formatting and terminology:
  - Refer to '\0' as NUL.
  - Append parentheses () when referencing function names.
  - Ensure trailing commas are present in initializers.
  - Reorder local variable declarations to follow the "reverse Xmas tree" 
    style. (Style-only change; kept existing Acked-by tags).
- Improve documentation: Refine comments and commit messages for better 
  clarity.
- Improve readability by using (1 * MEGA) instead of 1000000UL.
- Replace max_t() with max() where type-casting is unnecessary.
- Simplify the return value check for kunit_kzalloc() in 
  alloc_max_bench_buffer().
- Remove redundant NUL-terminator handling in STRING_BENCH_BUF().
- Optimize strnlen() implementation by replacing bleu/bgeu instructions 
  with minu, as suggested by Qingfang Deng.
- Remove incorrect Suggested-by tags from certain patches.
- Drop Tested-by tags for benchmark-related patches due to significant 
  framework changes since v3.
- Re-run all tests and update the performance data in the documentation.

v3:
- Re-implement benchmark logic inspired by crc_benchmark().
- Add 'len - 2' test case to strnlen() correctness tests.
- Incorporate detailed benchmark data into individual commit messages.

v2:
- Refactor lib/string.c to export __generic_* functions and add
  corresponding functional/performance tests for strnlen(), strchr(),
  and strrchr() (Andy Shevchenko).
- Replace magic numbers with STRING_TEST_MAX_LEN etc. (Andy Shevchenko).

v1: Initial submission.

---

Feng Jiang (8):
  lib/string_kunit: add correctness test for strlen()
  lib/string_kunit: add correctness test for strnlen()
  lib/string_kunit: add correctness test for strrchr()
  lib/string_kunit: add performance benchmark for strlen()
  lib/string_kunit: extend benchmarks to strnlen() and chr searches
  riscv: lib: add strnlen() implementation
  riscv: lib: add strchr() implementation
  riscv: lib: add strrchr() implementation

 arch/riscv/include/asm/string.h |   9 ++
 arch/riscv/lib/Makefile         |   3 +
 arch/riscv/lib/strchr.S         |  35 +++++
 arch/riscv/lib/strnlen.S        | 164 ++++++++++++++++++++
 arch/riscv/lib/strrchr.S        |  37 +++++
 arch/riscv/purgatory/Makefile   |  11 +-
 lib/Kconfig.debug               |  11 ++
 lib/tests/string_kunit.c        | 265 ++++++++++++++++++++++++++++++++
 8 files changed, 534 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/lib/strchr.S
 create mode 100644 arch/riscv/lib/strnlen.S
 create mode 100644 arch/riscv/lib/strrchr.S

-- 
2.25.1

