Message-ID: <20200506051200.GA831492@ubuntu-s3-xlarge-x86>
Date: Tue, 5 May 2020 22:12:00 -0700
From: Nathan Chancellor <natechancellor@...il.com>
To: Arnd Bergmann <arnd@...db.de>
Cc: Herbert Xu <herbert@...dor.apana.org.au>,
"David S. Miller" <davem@...emloft.net>,
David Sterba <dsterba@...e.com>,
Horia Geantă <horia.geanta@....com>,
Eric Biggers <ebiggers@...gle.com>,
linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org,
clang-built-linux@...glegroups.com
Subject: Re: [PATCH] crypto: blake2b - Fix clang optimization for ARMv7-M

On Tue, May 05, 2020 at 03:53:45PM +0200, Arnd Bergmann wrote:
> When building for ARMv7-M, clang-9 or higher tries to unroll some loops,
> which ends up confusing the register allocator to the point of generating
> rather bad code and using more than the warning limit for stack frames:
>
> warning: stack frame size of 1200 bytes in function 'blake2b_compress' [-Wframe-larger-than=]
>
> Forcing it to not unroll the final loop avoids this problem.
>
> Fixes: 91d689337fe8 ("crypto: blake2b - add blake2b generic implementation")
> Signed-off-by: Arnd Bergmann <arnd@...db.de>
> ---
> crypto/blake2b_generic.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/crypto/blake2b_generic.c b/crypto/blake2b_generic.c
> index 1d262374fa4e..0ffd8d92e308 100644
> --- a/crypto/blake2b_generic.c
> +++ b/crypto/blake2b_generic.c
> @@ -129,7 +129,9 @@ static void blake2b_compress(struct blake2b_state *S,
> ROUND(9);
> ROUND(10);
> ROUND(11);
> -
> +#ifdef CONFIG_CC_IS_CLANG

Given your comment in the bug:

"The code is written to assume no loops are unrolled"

Does it make sense to make this unconditional and take compiler
heuristics out of it?

> +#pragma nounroll /* https://bugs.llvm.org/show_bug.cgi?id=45803 */
> +#endif
> for (i = 0; i < 8; ++i)
> S->h[i] = S->h[i] ^ v[i] ^ v[i + 8];
> }
> --
> 2.26.0
>
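
For illustration only, here is a minimal user-space sketch of what an
unconditional form of the hint could look like. It is not the patch under
review, which keeps the `#ifdef CONFIG_CC_IS_CLANG` guard. The names
`NO_UNROLL`, `fold_state` and `fold_state_selfcheck` are hypothetical. The
sketch assumes clang's `#pragma nounroll` and, for GCC 8 or later,
`#pragma GCC unroll 1`; on any other compiler it emits no hint at all.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper macro (not in the kernel tree): expands to a
 * "do not unroll" hint for the loop that follows, or to nothing when
 * the compiler has no known equivalent.
 */
#if defined(__clang__)
#define NO_UNROLL _Pragma("nounroll")
#elif defined(__GNUC__) && (__GNUC__ >= 8)
#define NO_UNROLL _Pragma("GCC unroll 1")
#else
#define NO_UNROLL
#endif

/* Stand-in for the final loop of blake2b_compress(): h[i] ^= v[i] ^ v[i + 8]. */
static void fold_state(uint64_t h[8], const uint64_t v[16])
{
	int i;

	NO_UNROLL
	for (i = 0; i < 8; ++i)
		h[i] = h[i] ^ v[i] ^ v[i + 8];
}

/* Self-check: with h = 0 and v[i] = i, each word becomes i ^ (i + 8) = 8. */
static void fold_state_selfcheck(void)
{
	uint64_t h[8] = { 0 };
	uint64_t v[16];
	int i;

	for (i = 0; i < 16; ++i)
		v[i] = (uint64_t)i;

	fold_state(h, v);

	for (i = 0; i < 8; ++i)
		assert(h[i] == 8);
}
```

The hint only affects code generation, never the result, so the self-check
passes with or without it. Whether suppressing unrolling is a win on a given
compiler and target would still need to be confirmed from the generated code
and the reported stack frame size.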