Date:   Wed, 21 Oct 2020 21:34:50 -0700
From:   Eric Biggers <ebiggers@...nel.org>
To:     Arvind Sankar <nivedita@...m.mit.edu>
Cc:     Herbert Xu <herbert@...dor.apana.org.au>,
        "David S. Miller" <davem@...emloft.net>,
        "linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
        David Laight <David.Laight@...lab.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 6/6] crypto: lib/sha - Combine round constants and
 message schedule

On Tue, Oct 20, 2020 at 04:39:57PM -0400, Arvind Sankar wrote:
> Putting the round constants and the message schedule arrays together in
> one structure saves one register, which can be a significant benefit on
> register-constrained architectures. On x86-32 (tested on Broadwell
> Xeon), this gives a 10% performance benefit.
> 
> Signed-off-by: Arvind Sankar <nivedita@...m.mit.edu>
> Suggested-by: David Laight <David.Laight@...LAB.COM>
> ---
>  lib/crypto/sha256.c | 49 ++++++++++++++++++++++++++-------------------
>  1 file changed, 28 insertions(+), 21 deletions(-)
> 
> diff --git a/lib/crypto/sha256.c b/lib/crypto/sha256.c
> index 3a8802d5f747..985cd0560d79 100644
> --- a/lib/crypto/sha256.c
> +++ b/lib/crypto/sha256.c
> @@ -29,6 +29,11 @@ static const u32 SHA256_K[] = {
>  	0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
>  };
>  
> +struct KW {
> +	u32 K[64];
> +	u32 W[64];
> +};

Note that this doubles the stack usage from 256 to 512 bytes.  That's pretty
large for kernel code, especially when compiler options can increase the stack
usage well beyond the "expected" value.
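For reference, the arithmetic behind those numbers can be checked at compile time. This is only an illustrative sketch: uint32_t stands in for the kernel's u32, and kw_bytes() and w_only_bytes() are hypothetical helper names that do not appear in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct proposed in the patch, with uint32_t for u32. */
struct KW {
	uint32_t K[64];
	uint32_t W[64];
};

/* Before the patch only W[64] lives on the stack: 64 * 4 = 256 bytes. */
_Static_assert(sizeof(uint32_t[64]) == 256, "W alone is 256 bytes");
/* With K copied next to W the local object grows to 128 * 4 = 512 bytes. */
_Static_assert(sizeof(struct KW) == 512, "K plus W is 512 bytes");

/* Hypothetical helpers, only so the sizes can be inspected at run time. */
static size_t w_only_bytes(void) { return sizeof(uint32_t[64]); }
static size_t kw_bytes(void) { return sizeof(struct KW); }
```

These figures cover only the array storage; spills and instrumentation added by the compiler come on top of them.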

So unless this gives a big performance improvement on architectures other than
32-bit x86 (which people don't really care about these days), we probably
shouldn't do this.

FWIW, it's possible to reduce the length of 'W' to 16 words, either by computing
the next W value just before each of rounds 16-63, or by computing the next W
values in batches of 16 before rounds 16, 32, and 48.  (This is similar to what
lib/sha1.c does for SHA-1.)  In a quick userspace benchmark, though, that seems
to reduce performance by about 25% on x86_64.
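A minimal userspace sketch of the first variant (next W value computed just before each of rounds 16-63, kept in a 16-word circular buffer) might look like the following. It is not the code that was benchmarked and not the kernel implementation; the name sha256_block_w16(), the ror32() helper, and the use of uint32_t instead of u32 are assumptions made for illustration.

```c
#include <stdint.h>

static const uint32_t K[64] = {
	0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5, 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
	0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3, 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
	0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc, 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
	0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7, 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
	0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13, 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
	0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3, 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
	0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5, 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
	0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208, 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
};

static inline uint32_t ror32(uint32_t x, unsigned int n)
{
	return (x >> n) | (x << (32 - n));
}

/* Compress one 64-byte block; W needs 64 bytes of stack instead of 256. */
static void sha256_block_w16(uint32_t state[8], const uint8_t block[64])
{
	uint32_t W[16];
	uint32_t a = state[0], b = state[1], c = state[2], d = state[3];
	uint32_t e = state[4], f = state[5], g = state[6], h = state[7];
	int i;

	for (i = 0; i < 64; i++) {
		uint32_t w, t1, t2;

		if (i < 16) {
			/* Rounds 0-15: load the big-endian message words. */
			w = ((uint32_t)block[4 * i] << 24) |
			    ((uint32_t)block[4 * i + 1] << 16) |
			    ((uint32_t)block[4 * i + 2] << 8) |
			    (uint32_t)block[4 * i + 3];
		} else {
			/* Rounds 16-63: slot i & 15 still holds W[i - 16]. */
			uint32_t w15 = W[(i - 15) & 15], w2 = W[(i - 2) & 15];
			uint32_t s0 = ror32(w15, 7) ^ ror32(w15, 18) ^ (w15 >> 3);
			uint32_t s1 = ror32(w2, 17) ^ ror32(w2, 19) ^ (w2 >> 10);

			w = W[i & 15] + s0 + W[(i - 7) & 15] + s1;
		}
		W[i & 15] = w;

		t1 = h + (ror32(e, 6) ^ ror32(e, 11) ^ ror32(e, 25)) +
		     ((e & f) ^ (~e & g)) + K[i] + w;
		t2 = (ror32(a, 2) ^ ror32(a, 13) ^ ror32(a, 22)) +
		     ((a & b) ^ (a & c) ^ (b & c));
		h = g; g = f; f = e; e = d + t1;
		d = c; c = b; b = a; a = t1 + t2;
	}

	state[0] += a; state[1] += b; state[2] += c; state[3] += d;
	state[4] += e; state[5] += f; state[6] += g; state[7] += h;
}
```

The circular indexing works because round i only reads W[i-2], W[i-7], W[i-15], and W[i-16], all of which are still present in a 16-entry window. The trade-off is the extra indexing and the loss of a separate, up-front schedule expansion loop, which is presumably part of where the slowdown comes from.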

- Eric
