lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <29e766ca-54e4-453d-9dfc-ea47e2a1f860@linux.ibm.com>
Date: Tue, 21 Oct 2025 09:24:06 +0200
From: Holger Dengler <dengler@...ux.ibm.com>
To: Eric Biggers <ebiggers@...nel.org>
Cc: David Howells <dhowells@...hat.com>, Ard Biesheuvel <ardb@...nel.org>,
        "Jason A . Donenfeld" <Jason@...c4.com>, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, linux-s390@...r.kernel.org,
        linux-crypto@...r.kernel.org,
        Harald Freudenberger <freude@...ux.ibm.com>
Subject: Re: [PATCH 15/17] lib/crypto: s390/sha3: Migrate optimized code into
 library

On 20/10/2025 19:57, Eric Biggers wrote:
> On Mon, Oct 20, 2025 at 04:00:42PM +0200, Holger Dengler wrote:
>> On 20/10/2025 02:50, Eric Biggers wrote:
>>> Instead of exposing the s390-optimized SHA-3 code via s390-specific
>>> crypto_shash algorithms, instead just implement the sha3_absorb_blocks()
>>> and sha3_keccakf() library functions.  This is much simpler, it makes
>>> the SHA-3 library functions be s390-optimized, and it fixes the
>>> longstanding issue where the s390-optimized SHA-3 code was disabled by
>>> default.  SHA-3 still remains available through crypto_shash, but
>>> individual architectures no longer need to handle it.
>>>
>>> Note that the existing code used both CPACF_KIMD_SHA3_224 and
>>> CPACF_KIMD_SHA3_256 after checking for just CPACF_KIMD_SHA3_256, and
>>> similarly for 384 and 512.  I've preserved that behavior.
>>>
>>> Signed-off-by: Eric Biggers <ebiggers@...nel.org>
>> The current code also cover a performance feature, which allows (on
>> supported hardware, e.g. z17) to skip the ICV initialization.
> 
> I'm not sure if by "ICV" you mean "Integrity Check Value" or "Initial
> Chaining Value", but SHA-3 doesn't have either of those.  It just starts
> with a state of all zeroes.  I assume that skipping the
> zero-initialization of the state is what you're referring to?

I meant "Initial Chaining Value". On s390, this memory is set to zero by the
KIMD/KLMD instructions so that it can be skipped in the init() calls.

>> support has been introduced with 88c02b3f79a6 ("s390/sha3: Support
>> sha3 performance enhancements"). Unfortunately, this patch removes
>> this support. Was this intended?
> 
> For now, yes.  I should have explained more in the patch, sorry.
> 
> As currently proposed, lib/crypto/sha3.c supports arch-specific
> overrides of sha3_absorb_blocks() and sha3_keccakf().  Those cover the
> Keccak-f permutation which is by far the most performance critical part.
> This strategy is working well in the SHA-2, SHA-1, and MD5 libraries,
> which support the same level of arch overrides.
> 
> We could update lib/crypto/sha3.c to allow architectures to override
> more of the code.  But we need to consider the tradeoffs:
> 
> - Risk of bugs.  QEMU doesn't support the s390 SHA-3 instructions, so no
>   one except the s390 folks can test the code.  I can try to write code
>   for you, but I can't test it.  And the s390 SHA-3 code has had bugs;
>   see commits 992b7066800f, 68279380266a5, 73c2437109c3.
> 
>   The first priority should be correctness.

Let me figure out, if me and my colleagues can do the testing for you.
Unfortunately, I'll be unavailable for the next two weeks. But I'll come back
with a solution for the testing.

> - The proposed change to the init functions would cause the format of
>   'struct __sha3_ctx' to be architecture-dependent.  While we can do
>   that if really needed, it's something that's best avoided for
>   simplicity.  It opens up more opportunity for error.

Would it make sense to provide a delayed zeroize mechanism of the initial
chaining value (ICV), and architectures may implement it or not? The feature
on s390 is exactly that: a delayed zeroize of the ICV, done by the instruction.

> - As I mentioned, Keccak-f is by far the most performance critical part
>   anyway.  The initial state is just all zeroes, and initializing it is
>   very lightweight.  Also consider that these contexts are often on the
>   stack, and people increasingly set the "init all stack variables to
>   zero" kernel hardening option anyway.
> 
>   I'll also note that commit 88c02b3f79a6 has no performance data in it.
>   So it's not clear that it actually helped much.
> 
> - The library has an optimization to greatly reduce the size of the
>   context: instead of buffering data separately, it just XOR's data into
>   the state.  So, if there's a sha3_*_init() followed by a sha3_update()
>   of less than 1 block, it will have to initialize the state anyway.  We
>   can delay it until that point on s390.  But again: complexity.
> 
> - These potential additional s390 optimizations would presumably help
>   the most on short messages.  However, on short messages, merely
>   switching to the library often gives a large performance improvement
>   due to eliminating the very slow call to crypto_alloc_shash().  That's
>   actually a lot more important.
> 
> I would suggest that we drop the sha3_*_init() optimization from
> consideration for now.  Providing overrides for the one-shot functions
> sha3_{224,256,384,512}() should be simpler as well as possibly a bit
> more useful, and I would suggest exploring that.

Ok, sounds reasonable.

> I guess I can try to write the code for you again.  But again, without
> QEMU support I cannot test it.  The first priority in cryptography code
> is correctness, so that's not a great position to be in.
> 
> Note that for new optimized code I'm requiring QEMU support for the
> instructions it uses.  This one would only be allowed because code that
> used these instructions already existed in arch/s390/crypto/.

Ok, I'll come back to you on that.

>> Please also add me and Harald Freudenberger to the cc: list for this patch.
> 
> Will do, thanks.
> 
> - Eric

-- 
Mit freundlichen Grüßen / Kind regards
Holger Dengler
--
IBM Systems, Linux on IBM Z Development
dengler@...ux.ibm.com


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ