Date:   Wed, 5 May 2021 10:20:41 +0000
From:   David Laight <David.Laight@...LAB.COM>
To:     'Christophe JAILLET' <christophe.jaillet@...adoo.fr>,
        Eric Biggers <ebiggers@...nel.org>
CC:     "herbert@...dor.apana.org.au" <herbert@...dor.apana.org.au>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "kernel-janitors@...r.kernel.org" <kernel-janitors@...r.kernel.org>
Subject: RE: [RFC PATCH] crypto: arc4: Implement a version optimized for
 memory usage

From: Christophe JAILLET
> Sent: 04 May 2021 19:00
> 
> On 04/05/2021 at 18:57, Eric Biggers wrote:
> > On Sun, May 02, 2021 at 09:29:46PM +0200, Christophe JAILLET wrote:
> >> +#if defined(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS)
> >> +#define S_type	u8
> >> +#else
> >> +#define S_type	u32
> >> +#endif
> >> +
> >>   struct arc4_ctx {
> >> -	u32 S[256];
> >> +	S_type S[256];
> >>   	u32 x, y;
> >>   };
> >
> > Is it actually useful to keep both versions?  It seems we could just use the u8
> > version everywhere.  Note that there aren't actually any unaligned memory
> > accesses, so choosing the version conditionally on
> > CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS seems odd.  What are you trying to
> > determine by checking that?
> 
> Hi, this was a misinterpretation on my part.
...
> 
> I wanted to avoid a potential performance cost related to using char (i.e.
> u8) instead of int (i.e. u32).
> On some architectures this could require some shift or masking or
> whatever to "unpack" the values of S.

The only architecture that Linux has run on where the hardware
did RMW (read-modify-write) accesses for byte writes was some very old Alpha CPU.
Later Alpha CPUs supported byte writes to memory.
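
For illustration only, here is roughly what a byte write costs when the
hardware can only store whole 32-bit words. The helper name store_byte_rmw
and the byte-lane numbering are assumptions made for this sketch; it is not
taken from any kernel or Alpha code.

```c
#include <stdint.h>

/*
 * Hypothetical emulation of a byte store using only 32-bit loads and
 * stores. Byte lane 0 is assumed to be the least significant byte of
 * each word (an assumption of this sketch).
 */
void store_byte_rmw(uint32_t *words, unsigned int byte_index, uint8_t value)
{
	unsigned int shift = (byte_index & 3) * 8;	/* bit offset of the lane */
	uint32_t w = words[byte_index >> 2];		/* read the containing word */

	w &= ~((uint32_t)0xff << shift);		/* clear the old byte */
	w |= (uint32_t)value << shift;			/* merge in the new byte */
	words[byte_index >> 2] = w;			/* write the word back */
}
```

The extra load, mask and merge are the "shift or masking" overhead the
quoted text worries about; with native byte stores none of it is needed.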

On many architectures (not x86 or arm) indexing a byte array
is better because it saves the instruction that multiplies the index by 4.
On x86-64 you want to use 'unsigned int' for array indexes
so the compiler doesn't have to emit an instruction to sign-extend
a 32-bit int to 64 bits (sometimes it knows the extension can't be needed).
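
A small sketch of the three cases described above. The function names are
hypothetical and uint8_t/uint32_t stand in for the kernel's u8/u32; the
comments describe what a compiler typically has to do, not guaranteed
output for any particular compiler or target.

```c
#include <stdint.h>

/* Byte table, unsigned index: the element address is simply S + i. */
uint8_t lookup_u8(const uint8_t *S, unsigned int i)
{
	return S[i];
}

/*
 * Word table: the element address is S + 4 * i. x86 and arm can fold
 * the scaling into the addressing mode; other targets may need a
 * separate shift or multiply.
 */
uint32_t lookup_u32(const uint32_t *S, unsigned int i)
{
	return S[i];
}

/*
 * Signed index: on a 64-bit target the int generally has to be
 * sign-extended to pointer width before it can be added to S.
 */
uint8_t lookup_u8_signed(const uint8_t *S, int i)
{
	return S[i];
}
```

All three return the same element for the same in-range index; the
difference is only in the address arithmetic the compiler must generate.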

FWIW with a modern compiler all those temporaries are pointless.
The number of lines of code can be halved.
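
As a rough idea of what a version without the extra temporaries could look
like, here is a minimal userspace sketch of textbook RC4 with a byte-sized
S array. It is not the kernel's arc4 code or the RFC patch: the names
arc4_ctx_u8, arc4_swap, arc4_setkey_sketch and arc4_crypt_sketch are
made up for this example, uint8_t stands in for u8, and the state starts
at x = 0, y = 0 as in the textbook description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context: 256-byte S-box instead of 256 x u32. */
struct arc4_ctx_u8 {
	uint8_t S[256];
	uint8_t x, y;
};

static void arc4_swap(uint8_t *S, unsigned int a, unsigned int b)
{
	uint8_t t = S[a];

	S[a] = S[b];
	S[b] = t;
}

/* Key schedule: identity permutation, then mix in the key bytes. */
void arc4_setkey_sketch(struct arc4_ctx_u8 *ctx, const uint8_t *key,
			size_t key_len)
{
	unsigned int i, j = 0;

	assert(key_len > 0);	/* avoid a modulo by zero below */

	for (i = 0; i < 256; i++)
		ctx->S[i] = i;
	for (i = 0; i < 256; i++) {
		j = (j + ctx->S[i] + key[i % key_len]) & 0xff;
		arc4_swap(ctx->S, i, j);
	}
	ctx->x = 0;
	ctx->y = 0;
}

/* Keystream generation XORed into the data; unsigned indexes throughout. */
void arc4_crypt_sketch(struct arc4_ctx_u8 *ctx, uint8_t *out,
		       const uint8_t *in, size_t len)
{
	unsigned int x = ctx->x, y = ctx->y;

	while (len--) {
		x = (x + 1) & 0xff;
		y = (y + ctx->S[x]) & 0xff;
		arc4_swap(ctx->S, x, y);
		*out++ = *in++ ^ ctx->S[(ctx->S[x] + ctx->S[y]) & 0xff];
	}
	ctx->x = x;
	ctx->y = y;
}
```

With the key "Key" and the input "Plaintext" this produces the widely
published RC4 test vector BB F3 16 E8 D9 40 AF 0A D3. Whether a compiler
generates code as good as the hand-scheduled version would need to be
checked on the targets that matter.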

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)