lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHmME9rtrnaU1LHM_dZ1WNw-Td4UhOz0kjmqb-e-0ykOnW7XyA@mail.gmail.com>
Date:   Wed, 15 Aug 2018 23:31:11 -0700
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     Eric Biggers <ebiggers@...nel.org>
Cc:     Eric Biggers <ebiggers3@...il.com>,
        Linux Crypto Mailing List <linux-crypto@...r.kernel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Netdev <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>,
        Andrew Lutomirski <luto@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Samuel Neves <sneves@....uc.pt>,
        "Daniel J . Bernstein" <djb@...yp.to>,
        Tanja Lange <tanja@...erelliptic.org>,
        Jean-Philippe Aumasson <jeanphilippe.aumasson@...il.com>,
        Karthikeyan Bhargavan <karthik.bhargavan@...il.com>
Subject: Re: [PATCH v1 2/3] zinc: Introduce minimal cryptography library

Hi Eric,

On Tue, Aug 14, 2018 at 2:12 PM Eric Biggers <ebiggers@...nel.org> wrote:
> On ARM Cortex-A7, OpenSSL's ChaCha20 implementation is 13.9 cpb (cycles per
> byte), whereas Linux's is faster: 11.9 cpb.
>
> The reason Linux's ChaCha20 NEON implementation is faster than OpenSSL's
>
> I understand there are tradeoffs, and different implementations can be faster on
> different CPUs.
>
> So if your proposal goes in, I'd likely need to write a patch
> to get the old performance back, at least on Cortex-A7...

Yes, absolutely. Different CPUs behave differently indeed, but if you
have improvements for hardware that matters to you, we should
certainly incorporate these, and also loop Andy Polyakov in (I've
added him to the CC for the WIP v2). ChaCha is generally pretty
obvious, but for big integer algorithms -- like Poly1305 and
Curve25519 -- I think it's all the more important to involve Andy and
the rest of the world in general, so that Linux benefits from bug
research and fuzzing in places that are typically and classically
prone to nasty issues. In other words, let's definitely incorporate
your improvements after the patchset goes in, and at the same time
we'll try to bring Andy and others into the fold, where our
improvements can generally track each others.

> Also, I don't know whether Andy P. considered the 4xNEON implementation
> technique.  It could even be fastest on other ARM CPUs too, I don't know.

After v2, when he's CC'd in, let's plan to start discussing this with him.

Jason

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ