lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 11 Apr 2018 16:20:42 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Salvatore Mesoraca' <s.mesoraca16@...il.com>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "kernel-hardening@...ts.openwall.com" 
        <kernel-hardening@...ts.openwall.com>,
        "linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        "Kees Cook" <keescook@...omium.org>,
        Eric Biggers <ebiggers3@...il.com>,
        "Laura Abbott" <labbott@...hat.com>
Subject: RE: [PATCH v2 0/2] crypto: removing various VLAs

From: Salvatore Mesoraca
> Sent: 09 April 2018 17:38
...
> > You can also do much better than allocating MAX_BLOCKSIZE + MAX_ALIGNMASK
> > bytes by requesting 'long' aligned on-stack memory.
> > The easiest way is to define a union like:
> >
> > union crypto_tmp {
> >         u8 buf[CRYPTO_MAX_TMP_BUF];
> >         long buf_align;
> > };
> >
> > Then in each function:
> >
> >         union tmp crypto_tmp;
> >         u8 *keystream = PTR_ALIGN(tmp.buf, alignmask + 1);
> >
> > I think CRYPTO_MAX_TMP_BUF needs to be MAX_BLOCKSIZE + MAX_ALIGNMASK - sizeof (long).
> 
> Yeah, that would be nice, it might save us 4-8 bytes on the stack.
> But I was thinking, wouldn't it be even better to do something like:
> 
> u8 buf[CRYPTO_MAX_TMP_BUF] __aligned(__alignof__(long));
> u8 *keystream = PTR_ALIGN(buf, alignmask + 1);
> 
> In this case __aligned should work, if I'm not missing some other
> subtle GCC caveat.

Thinking further, there is no point aligning the buffer to less than
the maximum alignment allowed - it just adds code.

So you end up with:
#define MAX_STACK_ALIGN __alignof__(long)  /* Largest type the compiler can align on stack */
#define CRYPTO_MAX_TMP_BUF (MAX_BLOCKSIZE + MAX_ALIGNMASK + 1 - MAX_STACK_ALIGN)
u8 buf[CRYPTO_MAX_TMP_BUF] __aligned(MAX_STACK_ALIGN);
u8 *keystream = PTR_ALIGN(buf, MAX_ALIGNMASK + 1);

The last two lines could be put into a #define of their own so that the 'call sites'
don't need to know the gory details of how the buffer is defined.

In principle you could just have:
u8 keystream[MAX_BLOCKSIZE] __aligned(MAX_ALIGNMASK + 1);

But that will go wrong if the stack alignment has gone wrong somewhere
and generates a double stack frame if the requested alignment is larger
than the expected stack alignment.

IIRC there is a gcc command line option to enforce stack alignment on
some/all function entry prologues. The gory details are held in some
old brain cells somewhere.

	David

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ