linux-kernel - Re: [PATCH v1 1/1] lib/crypto: tests: Add KUnit tests for AES

Message-ID: <20260117235906.GD74518@quark>
Date: Sat, 17 Jan 2026 15:59:06 -0800
From: Eric Biggers <ebiggers@...nel.org>
To: David Laight <david.laight.linux@...il.com>
Cc: Holger Dengler <dengler@...ux.ibm.com>,
	Ard Biesheuvel <ardb@...nel.org>,
	"Jason A . Donenfeld" <Jason@...c4.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Harald Freudenberger <freude@...ux.ibm.com>,
	linux-kernel@...r.kernel.org, linux-crypto@...r.kernel.org
Subject: Re: [PATCH v1 1/1] lib/crypto: tests: Add KUnit tests for AES

On Fri, Jan 16, 2026 at 10:30:15PM +0000, David Laight wrote:
> It may not matter what you do to get the cpu speed fixed.
> Looping calling ktime_get_ns() for 'long enough' should do it.

For the CPU frequency, sure.  But as I mentioned, the warm-up loop is
also intended to load the target code into cache, as the benchmarks are
intended to measure the warm cache case.  Yes, that part only needs one
call, but the loop accomplishes both.

> That would be test independent but the 'long enough' very
> cpu dependent.
> The benchmarks probably ought to have some common API - even if it
> is just in the kunit code.
> 
> The advantage of counting cpu clocks is the frequency then doesn't
> matter as much - L1 cache miss timings might change.
> 
> The difficulty is finding a cpu clock counter. Architecture dependent
> and may not exist (you don't want the fixed frequency 'sanitised' TSC).

Yes, not all architectures supported by Linux have a high-resolution
timer or cycle counter.  IIUC, for some the best resolution available is
that of "jiffies", which can increment as infrequently as once per 10
ms.  On such kernels, the benchmark naturally needs to run for
significantly longer than that to get a reasonably accurate time.

I certainly agree that the benchmarking code I've written is ad-hoc.
But at the same time, there's a bit more reasoning behind it than you
might think.  The "obvious" improvements suggested in this thread
(disabling IRQs, doing only 1 warm-up iteration, doing only 100
iterations) make assumptions that are not true on many systems.

- Eric