Message-ID: <CAMj1kXGf+0b=6kPAzzxgesaOYSJtzoL1oQyNqT2VrUkWFzwJzA@mail.gmail.com>
Date: Fri, 29 Aug 2025 18:05:42 +0200
From: Ard Biesheuvel <ardb@...nel.org>
To: Eric Biggers <ebiggers@...nel.org>
Cc: Honza Fikar <j.fikar@...il.com>, linux-crypto@...r.kernel.org, 
	linux-kernel@...r.kernel.org, "Jason A . Donenfeld" <Jason@...c4.com>, x86@...nel.org, 
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH 09/12] lib/crypto: blake2s: Always enable arch-optimized BLAKE2s code

On Fri, 29 Aug 2025 at 17:30, Eric Biggers <ebiggers@...nel.org> wrote:
>
> On Fri, Aug 29, 2025 at 03:08:56PM +0200, Honza Fikar wrote:
> > On Fri, Aug 29, 2025 at 2:54 PM Eric Biggers <ebiggers@...nel.org> wrote:
> >
> > > Currently, BLAKE2s support is always enabled ('obj-y'), since random.c
> > > uses it.  Therefore, the arch-optimized BLAKE2s code, which exists for
> > > ARM and x86_64, should be always enabled too.
> >
> > Maybe a stupid question: what about ARM64? The current NEON
> > implementation in kernel arch/arm/crypto/blake2s-core.S seems to be just
> > for ARM.
> >

That code is scalar, not NEON, and is carefully tuned to make use of
the ARM barrel shifter, which does not exist on arm64.

> > While the upstream BLAKE2s with NEON is both for ARM and Aarch64 (ARM64):
> >
> > https://github.com/BLAKE2/BLAKE2/blob/master/neon
>
> There's no ARM64 optimized BLAKE2s code in the Linux kernel yet.  If
> it's useful, someone would need to contribute it.
>

NEON is cumbersome in the kernel, so this only makes sense if it is
substantially more performant, and I'm skeptical that this is the
case, as you pointed out yourself in

commit 5172d322d34c30fb926b29aeb5a064e1fd8a5e13
Author: Eric Biggers <ebiggers@...gle.com>
Date:   Wed Dec 23 00:09:59 2020 -0800

    crypto: arm/blake2s - add ARM scalar optimized BLAKE2s

    Add an ARM scalar optimized implementation of BLAKE2s.

    NEON isn't very useful for BLAKE2s because the BLAKE2s block size
    is too small for NEON to help.  Each NEON instruction would depend
    on the previous one, resulting in poor performance.

Even if NEON code might be slightly faster on some cores, the fact
that it is sensitive to micro-architectural details makes it less
attractive.
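
To illustrate the dependency chain the commit message refers to, here is
a minimal portable C sketch of the BLAKE2s G mixing step as specified in
RFC 7693. It is not the kernel's ARM assembly, and the names rotr32 and
blake2s_g are hypothetical, chosen for this example only. Every line
consumes the result of the line before it, and every XOR is followed by
a rotate, which is the kind of operation a scalar ARM implementation can
fold into the second operand of another instruction.

```c
#include <stdint.h>

/* Rotate a 32-bit word right by c bits (valid for 0 < c < 32). */
static inline uint32_t rotr32(uint32_t w, unsigned int c)
{
	return (w >> c) | (w << (32 - c));
}

/*
 * One BLAKE2s G mixing step (RFC 7693) on four state words
 * v[0..3] = (a, b, c, d) with two message words x and y.
 * blake2s_g is a hypothetical name used for this sketch only.
 */
static inline void blake2s_g(uint32_t v[4], uint32_t x, uint32_t y)
{
	uint32_t a = v[0], b = v[1], c = v[2], d = v[3];

	/* Each statement depends on the one directly above it. */
	a = a + b + x;
	d = rotr32(d ^ a, 16);
	c = c + d;
	b = rotr32(b ^ c, 12);
	a = a + b + y;
	d = rotr32(d ^ a, 8);
	c = c + d;
	b = rotr32(b ^ c, 7);

	v[0] = a;
	v[1] = b;
	v[2] = c;
	v[3] = d;
}
```

A full BLAKE2s round applies G to the four columns of the 4x4 state and
then to the four diagonals. A SIMD version can process the four
independent G calls in parallel lanes, but within each call the eight
steps above remain strictly serial.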
