Message-ID: <86d3947bce1f49c395224998e7d65dc2@AcuMS.aculab.com>
Date: Sun, 17 Dec 2023 17:00:48 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Ivan Orlov' <ivan.orlov0322@...il.com>, "paul.walmsley@...ive.com"
	<paul.walmsley@...ive.com>, "palmer@...belt.com" <palmer@...belt.com>,
	"aou@...s.berkeley.edu" <aou@...s.berkeley.edu>
CC: "conor.dooley@...rochip.com" <conor.dooley@...rochip.com>,
	"ajones@...tanamicro.com" <ajones@...tanamicro.com>, "samuel@...lland.org"
	<samuel@...lland.org>, "alexghiti@...osinc.com" <alexghiti@...osinc.com>,
	"linux-riscv@...ts.infradead.org" <linux-riscv@...ts.infradead.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"skhan@...uxfoundation.org" <skhan@...uxfoundation.org>
Subject: RE: [PATCH] riscv: lib: Optimize 'strlen' function

From: Ivan Orlov
> Sent: 13 December 2023 15:46
> 
> The current non-ZBB implementation of 'strlen' function iterates the
> memory bytewise, looking for a zero byte. It could be optimized to use
> the wordwise iteration instead, so we will process 4/8 bytes of memory
> at a time.
...
> 1. If the address is unaligned, iterate SZREG - (address % SZREG) bytes
> to align it.

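An alternative is to mask the address down to a word boundary and 'or'
non-zero bytes into the first word, covering the bytes that precede the
start of the string. That might be faster than a bytewise alignment loop.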
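
A minimal user-space C sketch of that masking idea, not the patch's assembly. It assumes a little-endian machine (as RISC-V Linux is), that uintptr_t is the native word, and that reading the whole aligned word containing the string is acceptable (it cannot cross a page, but it is outside strict ISO C object bounds). The names strlen_masked, has_zero and load_word are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  ((uintptr_t)-1 / 0xff)	/* 0x0101...01 */
#define HIGHS (ONES * 0x80)		/* 0x8080...80 */

/* Non-zero if and only if at least one byte of w is zero. */
static inline uintptr_t has_zero(uintptr_t w)
{
	return (w - ONES) & ~w & HIGHS;
}

/* Load one word from a word-aligned address without aliasing issues. */
static inline uintptr_t load_word(const unsigned char *p)
{
	uintptr_t w;

	memcpy(&w, p, sizeof(w));
	return w;
}

/* Hypothetical word-at-a-time strlen with no bytewise alignment loop. */
size_t strlen_masked(const char *s)
{
	const size_t wsz = sizeof(uintptr_t);
	size_t off = (uintptr_t)s % wsz;
	const unsigned char *p = (const unsigned char *)s - off;
	uintptr_t w = load_word(p);

	/*
	 * Little-endian assumption: the 'off' bytes before the string are
	 * the low-order bytes of the first word. Force them non-zero so
	 * they can never be mistaken for the terminator.
	 */
	if (off)
		w |= ((uintptr_t)1 << (8 * off)) - 1;

	/* Skip whole words that contain no zero byte. */
	while (!has_zero(w)) {
		p += wsz;
		w = load_word(p);
	}

	/* Find the first zero byte in the final word, lowest address first. */
	while (w & 0xff) {
		w >>= 8;
		p++;
	}
	return (size_t)(p - (const unsigned char *)s);
}
```

The last loop is kept bytewise for clarity; a real implementation would more likely use a count-trailing-zeros step on the has_zero() result.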

...
> Here you can find the benchmarking results for the VisionFive2 board
> comparing the old and new implementations of the strlen function.
> 
> Size: 1 (+-0), mean_old: 673, mean_new: 666
> Size: 2 (+-0), mean_old: 672, mean_new: 676
> Size: 4 (+-0), mean_old: 685, mean_new: 659
> Size: 8 (+-0), mean_old: 682, mean_new: 673
> Size: 16 (+-0), mean_old: 718, mean_new: 694
...

Is that 32-bit or 64-bit?
The word-at-a-time strlen() is typically not worth it for 32-bit.

I'd also guess that pretty much all the calls in-kernel are short.
You might try counting as: histogram[ilog2(strlen_result)]++
and seeing what it shows for some workload.
I bet you (a beer if I see you!) that you won't see many over 1k.
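
Something like the following user-space sketch shows the counting I have in mind. It is only an illustration: counted_strlen, ilog2_sz and print_histogram are hypothetical names, and the extra bucket for zero-length strings is my assumption, since ilog2(0) is not meaningful.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One bucket for length 0 plus one per possible bit position. */
#define NBUCKETS (8 * sizeof(size_t) + 1)

static unsigned long histogram[NBUCKETS];

/* floor(log2(n)) for n > 0; a stand-in for the kernel's ilog2(). */
static unsigned int ilog2_sz(size_t n)
{
	unsigned int r = 0;

	while (n >>= 1)
		r++;
	return r;
}

/*
 * Hypothetical wrapper: bucket 0 counts empty strings, bucket k + 1
 * counts lengths in the range [2^k, 2^(k+1)).
 */
static size_t counted_strlen(const char *s)
{
	size_t len = strlen(s);

	histogram[len ? ilog2_sz(len) + 1 : 0]++;
	return len;
}

/* Print the non-empty buckets with the smallest length each one covers. */
static void print_histogram(void)
{
	for (size_t i = 0; i < NBUCKETS; i++) {
		if (!histogram[i])
			continue;
		printf("len >= %zu: %lu calls\n",
		       i ? (size_t)1 << (i - 1) : (size_t)0, histogram[i]);
	}
}
```

In the kernel the counters would need to be per-cpu or atomic and dumped through something like debugfs; that part is left out here.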

	David


