Message-ID: <ZmL8lZo-h0FYwfNc@gondor.apana.org.au>
Date: Fri, 7 Jun 2024 20:27:01 +0800
From: Herbert Xu <herbert@...dor.apana.org.au>
To: Eric Biggers <ebiggers@...nel.org>
Cc: linux-crypto@...r.kernel.org, x86@...nel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 0/2] x86_64 AES-GCM improvements

Eric Biggers <ebiggers@...nel.org> wrote:
> This patchset adds a VAES and AVX512 / AVX10 implementation of AES-GCM
> (Galois/Counter Mode), which improves AES-GCM performance by up to 162%.
> In addition, it replaces the old AES-NI GCM code from Intel with new
> code that is slightly faster and fixes a number of issues including the
> massive binary size of over 250 KB.  See the patches for details.
> 
> The end state of the x86_64 AES-GCM assembly code is that we end up with
> two assembly files, one that generates AES-NI code with or without AVX,
> and one that generates VAES code with AVX512 / AVX10 with 256-bit or
> 512-bit vectors.  There's no support for VAES alone (without AVX512 /
> AVX10).  This differs slightly from what I did with AES-XTS where one
> file generates both AVX and AVX512 / AVX10 code including code using
> VAES alone (without AVX512 / AVX10), and another file generates non-AVX
> code only.  For now this seems like the right choice for each
> algorithm, though: being limited to 16 SIMD registers and 128-bit
> vectors led to significantly different design choices for AES-GCM,
> but not as much for AES-XTS.  CPUs shipping with VAES alone also seem
> to be a temporary thing, so we perhaps shouldn't go too far out of
> our way to support that combination.
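
The split described above (an AES-NI file with or without AVX, and a
VAES file that requires AVX512 / AVX10 with 256-bit or 512-bit vectors)
implies a selection step at init time.  The following is only a rough
userspace sketch of that decision, not the code from the patches: the
struct, the enum, select_gcm_impl() and the prefer_256bit flag are all
hypothetical names, and the exact set of CPU features checked is an
assumption.

```c
#include <stdbool.h>

/* Hypothetical feature flags; this is NOT the kernel's cpufeature API. */
struct cpu_features {
	bool aesni;
	bool pclmulqdq;
	bool avx;
	bool vaes;
	bool vpclmulqdq;
	bool avx512_or_avx10;
	bool prefer_256bit;	/* assumption: policy knob for narrower vectors */
};

/* Hypothetical labels for the variants named in the cover letter. */
enum gcm_impl {
	GCM_GENERIC,		/* no accelerated code usable */
	GCM_AESNI,		/* AES-NI file, non-AVX variant */
	GCM_AESNI_AVX,		/* AES-NI file, AVX variant */
	GCM_VAES_AVX10_256,	/* VAES file, 256-bit vectors */
	GCM_VAES_AVX10_512,	/* VAES file, 512-bit vectors */
};

static enum gcm_impl select_gcm_impl(const struct cpu_features *f)
{
	/* The VAES code is only usable together with AVX512 / AVX10. */
	if (f->vaes && f->vpclmulqdq && f->avx512_or_avx10)
		return f->prefer_256bit ? GCM_VAES_AVX10_256
					: GCM_VAES_AVX10_512;

	/*
	 * There is no "VAES alone" variant, so a CPU with VAES but
	 * without AVX512 / AVX10 falls through to the AES-NI code.
	 */
	if (f->aesni && f->pclmulqdq)
		return f->avx ? GCM_AESNI_AVX : GCM_AESNI;

	return GCM_GENERIC;
}
```

With this sketch, a CPU reporting VAES and AVX but neither AVX512 nor
AVX10 would get GCM_AESNI_AVX, which matches the "no support for VAES
alone" statement above.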
> 
> Changed in v5:
> - Fixed sparse warnings in gcm_setkey()
> - Fixed some comments in aes-gcm-aesni-x86_64.S
> 
> Changed in v4:
> - Added AES-NI rewrite patch.
> - Adjusted the VAES-AVX10 patch slightly to make it possible to cleanly
>  add the AES-NI support on top of it.
> 
> Changed in v3:
> - Optimized the finalization code slightly.
> - Fixed a minor issue in my userspace benchmark program (guard page
>  after key struct made "AVX512_Cloudflare" extra slow on some input
>  lengths) and regenerated tables 3-4.  Also upgraded to Emerald Rapids.
> - Eliminated an instruction from _aes_gcm_precompute.
> 
> Changed in v2:
> - Additional assembly optimizations
> - Improved some comments
> - Aligned key struct to 64 bytes
> - Added comparison with Cloudflare's implementation of AES-GCM
> - Other cleanups
> 
> Eric Biggers (2):
>  crypto: x86/aes-gcm - add VAES and AVX512 / AVX10 optimized AES-GCM
>  crypto: x86/aes-gcm - rewrite the AES-NI optimized AES-GCM
> 
> arch/x86/crypto/Kconfig                  |    1 +
> arch/x86/crypto/Makefile                 |    8 +-
> arch/x86/crypto/aes-gcm-aesni-x86_64.S   | 1128 +++++++++
> arch/x86/crypto/aes-gcm-avx10-x86_64.S   | 1222 ++++++++++
> arch/x86/crypto/aesni-intel_asm.S        | 1503 +-----------
> arch/x86/crypto/aesni-intel_avx-x86_64.S | 2804 ----------------------
> arch/x86/crypto/aesni-intel_glue.c       | 1269 ++++++----
> 7 files changed, 3125 insertions(+), 4810 deletions(-)
> create mode 100644 arch/x86/crypto/aes-gcm-aesni-x86_64.S
> create mode 100644 arch/x86/crypto/aes-gcm-avx10-x86_64.S
> delete mode 100644 arch/x86/crypto/aesni-intel_avx-x86_64.S
> 
> 
> base-commit: aabbf2135f9a9526991f17cb0c78cf1ec878f1c2

All applied.  Thanks.
-- 
Email: Herbert Xu <herbert@...dor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
