Message-ID: <20200506051200.GA831492@ubuntu-s3-xlarge-x86>
Date:   Tue, 5 May 2020 22:12:00 -0700
From:   Nathan Chancellor <natechancellor@...il.com>
To:     Arnd Bergmann <arnd@...db.de>
Cc:     Herbert Xu <herbert@...dor.apana.org.au>,
        "David S. Miller" <davem@...emloft.net>,
        David Sterba <dsterba@...e.com>,
        Horia Geantă <horia.geanta@....com>,
        Eric Biggers <ebiggers@...gle.com>,
        linux-crypto@...r.kernel.org, linux-kernel@...r.kernel.org,
        clang-built-linux@...glegroups.com
Subject: Re: [PATCH] crypto: blake2b - Fix clang optimization for ARMv7-M

On Tue, May 05, 2020 at 03:53:45PM +0200, Arnd Bergmann wrote:
> When building for ARMv7-M, clang-9 or higher tries to unroll some loops,
> which ends up confusing the register allocator to the point of generating
> rather bad code and using more than the warning limit for stack frames:
> 
> warning: stack frame size of 1200 bytes in function 'blake2b_compress' [-Wframe-larger-than=]
> 
> Forcing it to not unroll the final loop avoids this problem.
> 
> Fixes: 91d689337fe8 ("crypto: blake2b - add blake2b generic implementation")
> Signed-off-by: Arnd Bergmann <arnd@...db.de>
> ---
>  crypto/blake2b_generic.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/crypto/blake2b_generic.c b/crypto/blake2b_generic.c
> index 1d262374fa4e..0ffd8d92e308 100644
> --- a/crypto/blake2b_generic.c
> +++ b/crypto/blake2b_generic.c
> @@ -129,7 +129,9 @@ static void blake2b_compress(struct blake2b_state *S,
>  	ROUND(9);
>  	ROUND(10);
>  	ROUND(11);
> -
> +#ifdef CONFIG_CC_IS_CLANG

Given your comment in the bug:

"The code is written to assume no loops are unrolled"

Does it make sense to make this unconditional and take compiler
heuristics out of it?

> +#pragma nounroll /* https://bugs.llvm.org/show_bug.cgi?id=45803 */
> +#endif
>  	for (i = 0; i < 8; ++i)
>  		S->h[i] = S->h[i] ^ v[i] ^ v[i + 8];
>  }
> -- 
> 2.26.0
> 
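
To make the question above concrete: if the hint were applied regardless of
compiler, a bare "#pragma nounroll" would be an unknown pragma to GCC, so one
option is a small wrapper macro that picks the spelling each compiler
understands. This is only a sketch under assumptions, not a proposed patch:
NO_UNROLL and fold_state() are hypothetical names that do not exist in the
kernel tree, the loop is a standalone copy of the final loop in the quoted
hunk, and the GCC branch assumes GCC 8 or later, where "#pragma GCC unroll"
is available.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the kernel: expand to the
 * "do not unroll" hint in whichever form the compiler accepts.
 */
#if defined(__clang__)
#define NO_UNROLL _Pragma("nounroll")
#elif defined(__GNUC__) && __GNUC__ >= 8
#define NO_UNROLL _Pragma("GCC unroll 1")	/* an unroll factor of 1 means no unrolling */
#else
#define NO_UNROLL				/* no known hint: leave it to the compiler */
#endif

/*
 * Standalone copy of the final loop of blake2b_compress() from the
 * quoted patch: fold the 16-word working vector v into the 8-word
 * chaining value h.
 */
void fold_state(uint64_t h[8], const uint64_t v[16])
{
	int i;

	NO_UNROLL
	for (i = 0; i < 8; ++i)
		h[i] = h[i] ^ v[i] ^ v[i + 8];
}
```

The pragma only affects code generation, so the result of the loop is the same
with or without it. Whether forcing this on compilers that do not show the
stack frame problem is worthwhile is exactly the open question.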
