Date:   Fri, 7 Aug 2020 12:33:33 -0700
From:   Andy Lutomirski
To:     Linus Torvalds
Cc:     Willy Tarreau, Marc Plumb,
        Theodore Ts'o, Netdev,
        Amit Klein,
        Eric Dumazet,
        "Jason A. Donenfeld",
        Andrew Lutomirski,
        Kees Cook,
        Thomas Gleixner,
        Peter Zijlstra,
        stable
Subject: Re: Flaw in "random32: update the net random state on interrupt and activity"

> On Aug 7, 2020, at 12:21 PM, Linus Torvalds wrote:
> On Fri, Aug 7, 2020 at 12:08 PM Andy Lutomirski wrote:
>> 4 cycles per byte on Core 2
> I took the reference C implementation as-is, and just compiled it with
> -O2, so my numbers may not be what some heavily optimized case does.
> But it was way more than that, even when amortizing for "only need to
> do it every 8 cases". I think the 4 cycles/byte might be some "zero
> branch mispredicts" case when you've fully unrolled the thing, but
> then you'll be taking I$ misses out of the wazoo, since by definition
> this won't be in your L1 I$ at all (only called every 8 times).
> Sure, it might look ok on microbenchmarks where it does stay hot in the
> cache all the time, but that's not realistic. I

No one said we have to do only one ChaCha20 block per slow-path hit.  In fact, the more we reduce the number of rounds, the larger the share of time we spend on I$ misses, branch mispredictions, etc., so reducing rounds may be barking up the wrong tree entirely.  We probably don’t want to have more than one page
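
To make the "more than one block per slow-path hit" idea concrete, here is a minimal userspace sketch, not kernel code and not a proposed patch.  Every name in it (struct batched_rng, rng_init, rng_refill, rng_u32) is hypothetical, and BLOCKS_PER_REFILL = 4 is an arbitrary assumption.  The block function is plain 20-round ChaCha with a 64-bit block counter and an all-zero nonce; there is no reseeding and no backtracking protection, since the only point is to show the slow path filling a buffer with several blocks so that its cache-cold cost is amortized over more outputs.

```c
#include <stdint.h>
#include <string.h>

#define BLOCK_WORDS       16  /* one ChaCha block = 16 x 32-bit words */
#define BLOCKS_PER_REFILL 4   /* assumed batch size, purely illustrative */
#define BUF_WORDS         (BLOCK_WORDS * BLOCKS_PER_REFILL)

#define ROTL32(v, n) (((v) << (n)) | ((v) >> (32 - (n))))
#define QR(a, b, c, d) do {                 \
        a += b; d ^= a; d = ROTL32(d, 16);  \
        c += d; b ^= c; b = ROTL32(b, 12);  \
        a += b; d ^= a; d = ROTL32(d, 8);   \
        c += d; b ^= c; b = ROTL32(b, 7);   \
    } while (0)

/* Hypothetical generator state: cipher input block plus an output buffer. */
struct batched_rng {
    uint32_t state[BLOCK_WORDS]; /* constants, key, counter, nonce */
    uint32_t buf[BUF_WORDS];     /* output of the last refill */
    unsigned pos;                /* next unread word in buf */
    unsigned refills;            /* slow-path hits, for illustration only */
};

/* One 20-round ChaCha block: out = permute(in) + in. */
static void chacha20_block(const uint32_t in[BLOCK_WORDS],
                           uint32_t out[BLOCK_WORDS])
{
    uint32_t x[BLOCK_WORDS];
    int i;

    memcpy(x, in, sizeof(x));
    for (i = 0; i < 10; i++) {          /* 10 double rounds */
        QR(x[0], x[4], x[8],  x[12]);   /* column rounds */
        QR(x[1], x[5], x[9],  x[13]);
        QR(x[2], x[6], x[10], x[14]);
        QR(x[3], x[7], x[11], x[15]);
        QR(x[0], x[5], x[10], x[15]);   /* diagonal rounds */
        QR(x[1], x[6], x[11], x[12]);
        QR(x[2], x[7], x[8],  x[13]);
        QR(x[3], x[4], x[9],  x[14]);
    }
    for (i = 0; i < BLOCK_WORDS; i++)
        out[i] = x[i] + in[i];
}

static void rng_init(struct batched_rng *r, const uint32_t key[8])
{
    /* "expand 32-byte k" as little-endian words */
    static const uint32_t sigma[4] = {
        0x61707865, 0x3320646e, 0x79622d32, 0x6b206574
    };

    memcpy(r->state, sigma, sizeof(sigma));
    memcpy(r->state + 4, key, 8 * sizeof(uint32_t));
    memset(r->state + 12, 0, 4 * sizeof(uint32_t)); /* counter and nonce = 0 */
    r->pos = BUF_WORDS;                             /* force refill on first use */
    r->refills = 0;
}

/* Slow path: generate several blocks in one go. */
static void rng_refill(struct batched_rng *r)
{
    int b;

    for (b = 0; b < BLOCKS_PER_REFILL; b++) {
        chacha20_block(r->state, r->buf + b * BLOCK_WORDS);
        if (++r->state[12] == 0)        /* 64-bit block counter */
            r->state[13]++;
    }
    r->pos = 0;
    r->refills++;
}

/* Fast path: hand out one buffered word, refilling only when empty. */
static uint32_t rng_u32(struct batched_rng *r)
{
    if (r->pos == BUF_WORDS)
        rng_refill(r);
    return r->buf[r->pos++];
}
```

With these assumed parameters, the slow path runs once per 64 calls to rng_u32() instead of once per 16, at the cost of a 256-byte buffer per generator.  Whether that trade is a win for a cache-cold caller is exactly the open question above; the sketch measures nothing.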

I wonder if AES-NI adds any value here.  AES-CTR is almost a drop-in replacement for ChaCha20, and maybe the performance for a cache-cold short run is better.
