Message-ID: <20260117235906.GD74518@quark>
Date: Sat, 17 Jan 2026 15:59:06 -0800
From: Eric Biggers <ebiggers@...nel.org>
To: David Laight <david.laight.linux@...il.com>
Cc: Holger Dengler <dengler@...ux.ibm.com>,
	Ard Biesheuvel <ardb@...nel.org>,
	"Jason A . Donenfeld" <Jason@...c4.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Harald Freudenberger <freude@...ux.ibm.com>,
	linux-kernel@...r.kernel.org, linux-crypto@...r.kernel.org
Subject: Re: [PATCH v1 1/1] lib/crypto: tests: Add KUnit tests for AES

On Fri, Jan 16, 2026 at 10:30:15PM +0000, David Laight wrote:
> It may not matter what you do to get the cpu speed fixed.
> Looping calling ktime_get_ns() for 'long enough' should do it.

For the CPU frequency, sure.  But as I mentioned, the warm-up loop is
also intended to load the target code into cache, as the benchmarks are
intended to measure the warm cache case.  Yes, that part only needs one
call, but the loop accomplishes both.
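
For illustration only, here is a minimal userspace sketch of that pattern:
untimed warm-up calls followed by a timed loop.  It is not the KUnit code
from the patch.  target(), now_ns() and bench_ns() are hypothetical names,
and clock_gettime(CLOCK_MONOTONIC) stands in for ktime_get_ns().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Volatile sink so the compiler cannot discard the work being timed. */
static volatile uint32_t sink;

/* Hypothetical stand-in for the code under test (not the kernel's AES). */
static void target(const uint8_t *buf, size_t len)
{
	uint32_t acc = sink;

	for (size_t i = 0; i < len; i++)
		acc = acc * 31 + buf[i];
	sink = acc;
}

/* Monotonic time in nanoseconds; userspace substitute for ktime_get_ns(). */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Run `warmup` untimed calls, then time `iters` calls.  The warm-up calls
 * pull the target code and data into cache and give the CPU time to ramp
 * up its frequency before the measurement starts.
 */
static uint64_t bench_ns(const uint8_t *buf, size_t len,
			 unsigned int warmup, unsigned int iters)
{
	uint64_t t0;

	assert(iters > 0);
	for (unsigned int i = 0; i < warmup; i++)
		target(buf, len);

	t0 = now_ns();
	for (unsigned int i = 0; i < iters; i++)
		target(buf, len);
	return now_ns() - t0;
}
```

A caller would divide the returned time by iters to get a per-call figure.
How many warm-up calls are enough is CPU dependent, which is the point
raised in the quoted text below.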

> That would be test independent but the 'long enough' very
> cpu dependent.
> The benchmarks probably ought to have some common API - even if it
> is just in the kunit code.
> 
> The advantage of counting cpu clocks is the frequency then doesn't
> matter as much - L1 cache miss timings might change.
> 
> The difficulty is finding a cpu clock counter. Architecture dependent
> and may not exist (you don't want the fixed frequency 'sanitised' TSC).

Yes, not all architectures supported by Linux have a high-resolution
timer or cycle counter.  IIUC, for some the best resolution available is
that of "jiffies", which can increment as infrequently as once per 10
ms.  On such kernels, the benchmark naturally needs to run for
significantly longer than that to get a reasonably accurate time.
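
As a rough worked example of "significantly longer": a reading taken with
a timer of tick T can be off by about one tick, so a run of duration D has
a relative error of roughly T / D.  The sketch below turns that into a
minimum run time.  min_duration_ns() is a hypothetical helper, and the
error targets are arbitrary illustrative values, not figures from this
thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Shortest run (in ns) for which a one-tick timing error stays within
 * max_err_percent of the measured duration: D >= T * 100 / percent.
 */
static uint64_t min_duration_ns(uint64_t tick_ns, unsigned int max_err_percent)
{
	assert(max_err_percent > 0 && max_err_percent <= 100);
	return tick_ns * 100 / max_err_percent;
}
```

With a 10 ms tick, min_duration_ns(10000000, 1) is 1000000000, i.e. a
full second of benchmarking for a 1% bound, and a 5% bound still needs
200 ms.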

I certainly agree that the benchmarking code I've written is ad-hoc.
But at the same time, there's a bit more reasoning behind it than you
might think.  The "obvious" improvements suggested in this thread
(disabling IRQs, doing only 1 warm-up iteration, doing only 100
iterations) make assumptions that are not true on many systems.

- Eric
