Message-ID: <CAHmME9rqG56i+TfOhY-yt52XZrFRRTv0Vwr27qn5yL0=OpQa-A@mail.gmail.com>
Date:   Wed, 22 Apr 2020 14:17:28 -0600
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     Ard Biesheuvel <ardb@...nel.org>
Cc:     Eric Biggers <ebiggers@...nel.org>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        Linux Crypto Mailing List <linux-crypto@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH crypto-stable] crypto: arch/lib - limit simd usage to
 PAGE_SIZE chunks

On Wed, Apr 22, 2020 at 1:51 PM Jason A. Donenfeld <Jason@...c4.com> wrote:
>
> On Wed, Apr 22, 2020 at 1:39 AM Ard Biesheuvel <ardb@...nel.org> wrote:
> >
> > On Wed, 22 Apr 2020 at 09:32, Jason A. Donenfeld <Jason@...c4.com> wrote:
> > >
> > > On Tue, Apr 21, 2020 at 10:04 PM Eric Biggers <ebiggers@...nel.org> wrote:
> > > > Seems this should just be a 'while' loop?
> > > >
> > > >         while (bytes) {
> > > >                 unsigned int todo = min_t(unsigned int, PAGE_SIZE, bytes);
> > > >
> > > >                 kernel_neon_begin();
> > > >                 chacha_doneon(state, dst, src, todo, nrounds);
> > > >                 kernel_neon_end();
> > > >
> > > >                 bytes -= todo;
> > > >                 src += todo;
> > > >                 dst += todo;
> > > >         }
> > >
> > > The for(;;) is how it's done elsewhere in the kernel (that this patch
> > > doesn't touch), because then we can break out of the loop before
> > > having to increment src and dst unnecessarily. Likely a pointless
> > > optimization as probably the compiler can figure out how to avoid
> > > that. But maybe it can't. If you have a strong preference, I can
> > > refactor everything to use `while (bytes)`, but if you don't care,
> > > let's keep this as-is. Opinion?
> > >
> >
> > Since we're bikeshedding, I'd prefer 'do { } while (bytes);' here,
> > given that bytes is guaranteed to be non-zero before we enter the
> > loop. But in any case, I'd prefer avoiding for(;;) or while(1) where
> > we can.
>
> Okay, will do-while it up for v2.

I just sent v2 containing do-while, and I'm fine with that going in
that way. But just in the interest of curiosity in the pan-tone
palette, check this out:

https://godbolt.org/z/VxXien

It looks like with mine (the for(;;) version), the compiler avoids
executing those adds unnecessarily on the last iteration, but on the
other hand, it results in an otherwise unnecessary unconditional jump
for the < 4096 case. Sort of interesting. The arm64 code shows more or
less the same difference too.
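
For anyone who wants to compare the loop shapes locally rather than on
godbolt, here is a minimal userspace sketch of the three forms discussed
in this thread. It is not the kernel code: CHUNK_SIZE stands in for
PAGE_SIZE (assumed to be 4096), process_chunk() is a hypothetical
stand-in for chacha_doneon() that only copies bytes, and the
simd_sections counter stands in for the kernel_neon_begin() /
kernel_neon_end() pair. The for(;;) body is my reading of the shape
described above, since this message does not quote it.

```c
#include <assert.h>
#include <string.h>

#define CHUNK_SIZE 4096u /* assumption: stand-in for PAGE_SIZE */

/* Counts simulated begin/end sections; for illustration only. */
static unsigned int simd_sections;

/* Hypothetical stand-in for chacha_doneon(): just copies the chunk. */
static void process_chunk(unsigned char *dst, const unsigned char *src,
                          unsigned int len)
{
        simd_sections++;        /* where kernel_neon_begin() would go */
        memcpy(dst, src, len);
        /* where kernel_neon_end() would go */
}

static unsigned int min_uint(unsigned int a, unsigned int b)
{
        return a < b ? a : b;
}

/* for(;;) form: leaves the loop before the final pointer increments. */
static void chunked_for(unsigned char *dst, const unsigned char *src,
                        unsigned int bytes)
{
        for (;;) {
                unsigned int todo = min_uint(CHUNK_SIZE, bytes);

                process_chunk(dst, src, todo);
                bytes -= todo;
                if (!bytes)
                        break;
                src += todo;
                dst += todo;
        }
}

/* while form: also handles bytes == 0 without touching the data. */
static void chunked_while(unsigned char *dst, const unsigned char *src,
                          unsigned int bytes)
{
        while (bytes) {
                unsigned int todo = min_uint(CHUNK_SIZE, bytes);

                process_chunk(dst, src, todo);
                bytes -= todo;
                src += todo;
                dst += todo;
        }
}

/* do-while form: relies on the caller guaranteeing bytes != 0. */
static void chunked_do_while(unsigned char *dst, const unsigned char *src,
                             unsigned int bytes)
{
        assert(bytes != 0);
        do {
                unsigned int todo = min_uint(CHUNK_SIZE, bytes);

                process_chunk(dst, src, todo);
                bytes -= todo;
                src += todo;
                dst += todo;
        } while (bytes);
}
```

All three produce the same output and the same number of sections for
any non-zero length; they differ only in how the compiler lays out the
exit branch and the trailing pointer increments, which is what the
godbolt link compares.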
