lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Y0ddUkGg15kHCIiB@zx2c4.com>
Date:   Wed, 12 Oct 2022 18:35:30 -0600
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     Robert Elliott <elliott@....com>
Cc:     herbert@...dor.apana.org.au, davem@...emloft.net,
        tim.c.chen@...ux.intel.com, ap420073@...il.com, ardb@...nel.org,
        linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 09/19] crypto: x86 - use common macro for FPU limit

On Wed, Oct 12, 2022 at 04:59:21PM -0500, Robert Elliott wrote:
> Use a common macro name (FPU_BYTES) for the limit of the number of bytes
> processed within kernel_fpu_begin and kernel_fpu_end rather than
> using SZ_4K (which is a signed value), or a magic value of 4096U.

Not sure I like this very much. The whole idea is that this is variable
per algorithm, since not all algorithms have the same performance
characteristics. So in that sense, it's better to put this close to
where it's actually used, rather than somewhere at the top of the file.
When you do that, it makes it seem like "FPU_BYTES" is some universal
constant, which of course it isn't.

Instead, declare this as an untyped enum value within the function. For
example:

diff --git a/arch/x86/crypto/blake2s-glue.c b/arch/x86/crypto/blake2s-glue.c
index aaba21230528..602883eee5f3 100644
--- a/arch/x86/crypto/blake2s-glue.c
+++ b/arch/x86/crypto/blake2s-glue.c
@@ -30,7 +30,8 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,
 		      size_t nblocks, const u32 inc)
 {
 	/* SIMD disables preemption, so relax after processing each page. */
-	BUILD_BUG_ON(SZ_4K / BLAKE2S_BLOCK_SIZE < 8);
+	enum { BLOCKS_PER_FPU = SZ_4K / BLAKE2S_BLOCK_SIZE };
+	BUILD_BUG_ON(BLOCKS_PER_FPU < 8);

 	if (!static_branch_likely(&blake2s_use_ssse3) || !may_use_simd()) {
 		blake2s_compress_generic(state, block, nblocks, inc);
@@ -38,8 +39,7 @@ void blake2s_compress(struct blake2s_state *state, const u8 *block,
 	}

 	do {
-		const size_t blocks = min_t(size_t, nblocks,
-					    SZ_4K / BLAKE2S_BLOCK_SIZE);
+		const size_t blocks = min_t(size_t, nblocks, BLOCKS_PER_FPU);

 		kernel_fpu_begin();
 		if (IS_ENABLED(CONFIG_AS_AVX512) &&
diff --git a/arch/x86/crypto/chacha_glue.c b/arch/x86/crypto/chacha_glue.c
index 7b3a1cf0984b..f8fd2b7025c1 100644
--- a/arch/x86/crypto/chacha_glue.c
+++ b/arch/x86/crypto/chacha_glue.c
@@ -142,12 +142,14 @@ EXPORT_SYMBOL(chacha_init_arch);
 void chacha_crypt_arch(u32 *state, u8 *dst, const u8 *src, unsigned int bytes,
 		       int nrounds)
 {
+	enum { BYTES_PER_FPU = SZ_4K };
+
 	if (!static_branch_likely(&chacha_use_simd) || !crypto_simd_usable() ||
 	    bytes <= CHACHA_BLOCK_SIZE)
 		return chacha_crypt_generic(state, dst, src, bytes, nrounds);

 	do {
-		unsigned int todo = min_t(unsigned int, bytes, SZ_4K);
+		unsigned int todo = min_t(unsigned int, bytes, BYTES_PER_FPU);

 		kernel_fpu_begin();
 		chacha_dosimd(state, dst, src, todo, nrounds);


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ