Message-ID: <CAHmME9qNn5aRgtbV3bAsz3xW1A49a7RMMkOzGruBUPzLVUxVNg@mail.gmail.com>
Date: Thu, 27 Sep 2018 02:04:56 +0200
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: Ard Biesheuvel <ard.biesheuvel@...aro.org>
Cc: Herbert Xu <herbert@...dor.apana.org.au>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
Netdev <netdev@...r.kernel.org>,
Linux Crypto Mailing List <linux-crypto@...r.kernel.org>,
David Miller <davem@...emloft.net>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Samuel Neves <sneves@....uc.pt>,
Andrew Lutomirski <luto@...nel.org>,
Jean-Philippe Aumasson <jeanphilippe.aumasson@...il.com>,
Russell King - ARM Linux <linux@...linux.org.uk>,
linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH net-next v6 07/23] zinc: ChaCha20 ARM and ARM64 implementations
On Wed, Sep 26, 2018 at 5:52 PM Ard Biesheuvel
<ard.biesheuvel@...aro.org> wrote:
>
> On Wed, 26 Sep 2018 at 17:50, Jason A. Donenfeld <Jason@...c4.com> wrote:
> >
> > On Wed, Sep 26, 2018 at 5:45 PM Jason A. Donenfeld <Jason@...c4.com> wrote:
> > > So what you have in mind is something like calling simd_relax() every
> > > 4096 bytes or so?
> >
> > That was actually pretty easy, putting together both of your suggestions:
> >
> > static inline bool chacha20_arch(struct chacha20_ctx *state, u8 *dst,
> >                                  u8 *src, size_t len,
> >                                  simd_context_t *simd_context)
> > {
> >         while (len > PAGE_SIZE) {
> >                 chacha20_arch(state, dst, src, PAGE_SIZE, simd_context);
> >                 len -= PAGE_SIZE;
> >                 src += PAGE_SIZE;
> >                 dst += PAGE_SIZE;
> >                 simd_relax(simd_context);
> >         }
> >         if (IS_ENABLED(CONFIG_KERNEL_MODE_NEON) && chacha20_use_neon &&
> >             len >= CHACHA20_BLOCK_SIZE * 3 && simd_use(simd_context))
> >                 chacha20_neon(dst, src, len, state->key, state->counter);
> >         else
> >                 chacha20_arm(dst, src, len, state->key, state->counter);
> >
> >         state->counter[0] += (len + 63) / 64;
> >         return true;
> > }
>
> Nice one :-)
>
> This works for me (but perhaps add a comment as well)

As elegant as my quick recursive solution was, gcc generated fairly
poor code for it, as you might expect. So I've reimplemented it as a
boring old loop that works the way it's supposed to. This is queued
for v7.
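
For readers following along, here is a rough userspace sketch of what such a loop-based chunking scheme could look like. It is not the v7 code. CHUNK_SIZE stands in for PAGE_SIZE, and cipher_stub(), relax_stub(), process_chunked() and the two call counters are hypothetical names made up for this illustration. The stub cipher just XORs with a constant so the control flow can be checked outside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 4096u	/* assumption: stands in for PAGE_SIZE */
#define BLOCK_SIZE 64u		/* assumption: stands in for CHACHA20_BLOCK_SIZE */

/* Hypothetical counters, present only so the sketch can be checked. */
static unsigned int relax_calls;
static unsigned int chunk_calls;

/* Hypothetical stand-in for simd_relax(); here it only counts calls. */
static void relax_stub(void)
{
	relax_calls++;
}

/*
 * Hypothetical stand-in for the real cipher routines: XORs each byte
 * with a constant and advances the block counter by the number of
 * 64-byte blocks covered, rounding up.
 */
static void cipher_stub(uint8_t *dst, const uint8_t *src, size_t len,
			uint32_t *counter)
{
	size_t i;

	for (i = 0; i < len; i++)
		dst[i] = src[i] ^ 0x5a;
	*counter += (uint32_t)((len + BLOCK_SIZE - 1) / BLOCK_SIZE);
	chunk_calls++;
}

/*
 * Process the input in chunks of at most CHUNK_SIZE bytes, calling
 * relax_stub() between chunks but not after the final one.
 */
static void process_chunked(uint8_t *dst, const uint8_t *src, size_t len,
			    uint32_t *counter)
{
	assert(dst && src && counter);

	for (;;) {
		size_t todo = len < CHUNK_SIZE ? len : CHUNK_SIZE;

		cipher_stub(dst, src, todo, counter);
		len -= todo;
		if (!len)
			break;
		dst += todo;
		src += todo;
		relax_stub();
	}
}
```

With a 10000-byte input, this sketch would process two full 4096-byte chunks plus an 1808-byte tail, with one relax call between each pair of chunks. Unlike the recursive version quoted above, there is no self-call for the compiler to untangle.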