Message-ID: <20181205061054.GA26750@sol.localdomain>
Date: Tue, 4 Dec 2018 22:10:55 -0800
From: Eric Biggers <ebiggers@...nel.org>
To: Martin Willi <martin@...ongswan.org>
Cc: linux-crypto@...r.kernel.org,
Paul Crowley <paulcrowley@...gle.com>,
Milan Broz <gmazyland@...il.com>,
"Jason A . Donenfeld" <Jason@...c4.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 4/6] crypto: x86/chacha20 - add XChaCha20 support
Hi Martin,

On Sat, Dec 01, 2018 at 05:40:40PM +0100, Martin Willi wrote:
>
> > An SSSE3 implementation of single-block HChaCha20 is also added so
> > that XChaCha20 can use it rather than the generic
> > implementation. This required refactoring the ChaCha permutation
> > into its own function.
>
> > [...]
>
> > +ENTRY(chacha20_block_xor_ssse3)
> > + # %rdi: Input state matrix, s
> > + # %rsi: up to 1 data block output, o
> > + # %rdx: up to 1 data block input, i
> > + # %rcx: input/output length in bytes
> > +
> > + # x0..3 = s0..3
> > + movdqa 0x00(%rdi),%xmm0
> > + movdqa 0x10(%rdi),%xmm1
> > + movdqa 0x20(%rdi),%xmm2
> > + movdqa 0x30(%rdi),%xmm3
> > + movdqa %xmm0,%xmm8
> > + movdqa %xmm1,%xmm9
> > + movdqa %xmm2,%xmm10
> > + movdqa %xmm3,%xmm11
> > +
> > + mov %rcx,%rax
> > + call chacha20_permute
> > +
> > # o0 = i0 ^ (x0 + s0)
> > paddd %xmm8,%xmm0
> > cmp $0x10,%rax
> > @@ -189,6 +198,23 @@ ENTRY(chacha20_block_xor_ssse3)
> >
> > ENDPROC(chacha20_block_xor_ssse3)
> >
> > +ENTRY(hchacha20_block_ssse3)
> > + # %rdi: Input state matrix, s
> > + # %rsi: output (8 32-bit words)
> > +
> > + movdqa 0x00(%rdi),%xmm0
> > + movdqa 0x10(%rdi),%xmm1
> > + movdqa 0x20(%rdi),%xmm2
> > + movdqa 0x30(%rdi),%xmm3
> > +
> > + call chacha20_permute
>
> AFAIK, the general convention is to create proper stack frames using
> FRAME_BEGIN/END for non-leaf functions. Should chacha20_permute()
> callers do so?
>
Yes, I'll do that. (Ard suggested similarly in the arm64 version too.)
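
Roughly, and untested, the HChaCha20 entry point might then look like the
sketch below. It assumes the FRAME_BEGIN/FRAME_END macros from <asm/frame.h>,
which set up and tear down an %rbp frame when CONFIG_FRAME_POINTER is enabled
and expand to nothing otherwise. The two output stores are not part of the
hunk quoted above; they are an assumption based on HChaCha20 returning rows 0
and 3 of the permuted state, and on chacha20_permute leaving those rows in
%xmm0 and %xmm3.

```
#include <asm/frame.h>

ENTRY(hchacha20_block_ssse3)
	# %rdi: Input state matrix, s
	# %rsi: output (8 32-bit words)
	FRAME_BEGIN			# non-leaf: we call chacha20_permute

	movdqa		0x00(%rdi),%xmm0
	movdqa		0x10(%rdi),%xmm1
	movdqa		0x20(%rdi),%xmm2
	movdqa		0x30(%rdi),%xmm3

	call		chacha20_permute

	# Assumed, not quoted above: write out rows 0 and 3.
	movdqu		%xmm0,0x00(%rsi)
	movdqu		%xmm3,0x10(%rsi)

	FRAME_END			# must precede every ret in the function
	ret
ENDPROC(hchacha20_block_ssse3)
```

chacha20_block_xor_ssse3 would get the same treatment: FRAME_BEGIN right
after ENTRY, and FRAME_END before each return path.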
- Eric