Message-ID: <20251121000943.GC3532564@google.com>
Date: Fri, 21 Nov 2025 00:09:43 +0000
From: Eric Biggers <ebiggers@...nel.org>
To: David Howells <dhowells@...hat.com>
Cc: linux-crypto@...r.kernel.org, Herbert Xu <herbert@...dor.apana.org.au>,
Luis Chamberlain <mcgrof@...nel.org>,
Petr Pavlu <petr.pavlu@...e.com>,
Daniel Gomez <da.gomez@...nel.org>,
Sami Tolvanen <samitolvanen@...gle.com>,
"Jason A . Donenfeld" <Jason@...c4.com>,
Ard Biesheuvel <ardb@...nel.org>,
Stephan Mueller <smueller@...onox.de>,
Lukas Wunner <lukas@...ner.de>,
Ignat Korchagin <ignat@...udflare.com>, keyrings@...r.kernel.org,
linux-modules@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/4] lib/crypto: Add ML-DSA verification support
On Thu, Nov 20, 2025 at 09:10:00AM +0000, David Howells wrote:
> Eric Biggers <ebiggers@...nel.org> wrote:
>
> > - Is about 600 lines of source code instead of 4800.
>
> There's less shareable code for other algos that I'm sure people are going to
> ask for, but that's probably fine.

The "advanced" verification features that people could conceivably want
in the future (public key preloading, nonempty contexts, HashML-DSA,
external mu, incremental message hashing) would all be fairly
straightforward to add, in the event that they ever become needed.

Signing support would of course be challenging. But that's expected,
and we should try to keep that out of the kernel anyway.

> > - Generates about 4 KB of object code instead of 28 KB.
> > - Uses 9-13 KB of memory to verify a signature instead of 31-84 KB.
>
> That's definitely good.
>
> > - Is 3-5% faster, depending on the ML-DSA parameter set.
>
> That's not quite what I see. For Leancrypto:
>
> # benchmark_mldsa44: 8672 ops/s
> # benchmark_mldsa65: 5470 ops/s
> # benchmark_mldsa87: 3350 ops/s
>
> For your implementation:
>
> # benchmark_mldsa44: 8707 ops/s
> # benchmark_mldsa65: 5423 ops/s
> # benchmark_mldsa87: 3352 ops/s
>
> This may reflect differences in CPU (mine's an i3-4170).
>
> The numbers are pretty stable with the cpu frequency governor set to
> performance and without rebooting betweentimes.
>
> Interesting that your mldsa44 is consistently faster, but your mldsa65 is
> consistently slower. mldsa87 is consistently about the same.
>
> I don't think the time differences are particularly significant.

Sure, I had just tested one CPU. Slightly different results on
different CPUs are expected. It's also expected that the ops/s for
verification in a loop is still in roughly the same ballpark as your
integration of leancrypto (or the Dilithium reference code which
leancrypto seems to be based on, for that matter). There aren't too
many ways to implement the most time-consuming parts. Generally,
arch-optimized code would be needed to do significantly better.

Of course, the greatly reduced icache and dcache usage is much more
important for performance. But that doesn't show up in the "just verify
the same signature in a loop repeatedly" benchmark.

I'll clarify that part of the commit message accordingly.
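
For reference, the relative differences implied by the ops/s figures
quoted above can be worked out with a few lines of Python. This is only
an illustrative sketch: the helper name relative_diff_percent is made up
for this example, and the inputs are simply the numbers reported above
for the i3-4170.

```python
# Illustrative only: relative_diff_percent is a hypothetical helper,
# not part of the kernel, leancrypto, or any benchmark harness.

def relative_diff_percent(baseline_ops, candidate_ops):
    """Return how much faster (+) or slower (-) candidate is, in percent."""
    return (candidate_ops / baseline_ops - 1.0) * 100.0

# ops/s figures quoted earlier in this thread: (leancrypto, new code)
results = {
    "mldsa44": (8672, 8707),
    "mldsa65": (5470, 5423),
    "mldsa87": (3350, 3352),
}

for name, (leancrypto_ops, new_ops) in results.items():
    diff = relative_diff_percent(leancrypto_ops, new_ops)
    print(f"{name}: {diff:+.2f}%")

# Prints:
# mldsa44: +0.40%
# mldsa65: -0.86%
# mldsa87: +0.06%
```

On those numbers the gap is under 1% for every parameter set.
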
- Eric