Message-ID: <20171122212950.GA74584@gmail.com>
Date: Wed, 22 Nov 2017 13:29:50 -0800
From: Eric Biggers <ebiggers3@...il.com>
To: Ard Biesheuvel <ard.biesheuvel@...aro.org>
Cc: "linux-crypto@...r.kernel.org" <linux-crypto@...r.kernel.org>,
Herbert Xu <herbert@...dor.apana.org.au>,
Theodore Ts'o <tytso@....edu>,
"Jason A . Donenfeld" <Jason@...c4.com>,
Martin Willi <martin@...ongswan.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Eric Biggers <ebiggers@...gle.com>
Subject: Re: [PATCH 5/5] crypto: chacha20 - Fix keystream alignment for
chacha20_block()
On Wed, Nov 22, 2017 at 08:51:57PM +0000, Ard Biesheuvel wrote:
> On 22 November 2017 at 19:51, Eric Biggers <ebiggers3@...il.com> wrote:
> > From: Eric Biggers <ebiggers@...gle.com>
> >
> > When chacha20_block() outputs the keystream block, it uses 'u32' stores
> > directly. However, the callers (crypto/chacha20_generic.c and
> > drivers/char/random.c) declare the keystream buffer as a 'u8' array,
> > which is not guaranteed to have the needed alignment.
> >
> > Fix it by having both callers declare the keystream as a 'u32' array.
> > For now this is preferable to switching over to the unaligned access
> > macros because chacha20_block() is only being used in cases where we can
> > easily control the alignment (stack buffers).
> >
>
> Given this paragraph, I think we agree the correct way to fix this
> would be to make chacha20_block() adhere to its prototype, so if we
> deviate from that, there should be a good reason. On which
> architecture that cares about alignment is this expected to result in
> a measurable performance benefit?
>
Well, variables on the stack tend to be 4- or even 8-byte aligned anyway, so this
change probably doesn't make a difference in practice currently. But it should
still be fixed, in case it does become a problem.

We could certainly leave the type as a u8 array and use put_unaligned_le32()
instead; that would be a simpler change. But that would be slower on
architectures where a potentially-unaligned access requires multiple
instructions.
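
To illustrate the trade-off, here is a minimal userspace sketch, not the actual
kernel code. The names store_le32_bytewise(), emit_block_u8() and
emit_block_u32() are hypothetical; store_le32_bytewise() only stands in for what
put_unaligned_le32() does on an architecture without cheap unaligned stores.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for put_unaligned_le32(): store a 32-bit value in
 * little-endian order one byte at a time, which is valid for any alignment
 * of dst but costs several instructions per word.
 */
static void store_le32_bytewise(uint8_t *dst, uint32_t v)
{
	dst[0] = (uint8_t)(v);
	dst[1] = (uint8_t)(v >> 8);
	dst[2] = (uint8_t)(v >> 16);
	dst[3] = (uint8_t)(v >> 24);
}

/* Option A: keep the keystream as a byte array and use unaligned-safe stores. */
static void emit_block_u8(const uint32_t state[16], uint8_t out[64])
{
	for (int i = 0; i < 16; i++)
		store_le32_bytewise(&out[4 * i], state[i]);
}

/*
 * Option B: declare the keystream as a 32-bit array, so the type itself
 * guarantees alignment and plain word stores suffice. Values are stored in
 * host byte order here (an assumption of this sketch); little-endian output
 * would need an extra byte swap on big-endian hosts.
 */
static void emit_block_u32(const uint32_t state[16], uint32_t out[16])
{
	for (int i = 0; i < 16; i++)
		out[i] = state[i];
}
```

Option A works even when the buffer starts at an odd address; Option B moves
the alignment requirement into the caller's declaration, which is easy to
satisfy for stack buffers.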
Eric