Message-ID: <Yq+G+o1QQt75pwx/@zx2c4.com>
Date: Sun, 19 Jun 2022 22:28:42 +0200
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>
Subject: Re: [GIT pull] locking/urgent for 5.19-rc3
Hey Linus,
On Sun, Jun 19, 2022 at 03:05:23PM -0500, Linus Torvalds wrote:
> On Sun, Jun 19, 2022 at 11:38 AM Jason A. Donenfeld <Jason@...c4.com> wrote:
> >
> > This was initially my concern too, which I expressed to Sebastian, but
> > he made the point that this area here is rather "special". Actually,
> > randomness isn't really required here.
>
> That wasn't really my point.
>
> My point was that there are a lot of uses of prandom_u32() and friends
> in random places. Just grepping for it, there's lots of different
> drivers that use it. Who knows what locking they have.
>
> Clearly nobody *thought* about it. This one issue is purely about RT
> correctness, but how about all the uses that just want a pseudo-random
> number and may have performance issues, or may be calling things so
> much that a lock is just bad.
>
> The thing is, that prandom code used to be FAST. Not just "no locks",
In my benchmarks, get_random_u32() is on par with the old prandom_u32()
now. And a large part of that is that get_random_u32() is almost always
lockless. Every once in a while it needs to refill its buffer, so it
uses get_random_bytes(). And guess what? That too is almost always
lockless. But every once in a while -- on the order of once a minute --
get_random_bytes() takes an extremely short spinlock that just does one
block of ChaCha. And that's only if get_random_u32()'s buffer happening
to be empty is what triggers the reseeding there. So taken together,
this means there's a very short spinlock once per minute, and only if
the stars align. That means performance and lock contention here are
really not an issue.
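To make the batching concrete, here's a rough sketch of the idea --
illustrative only, not the actual random.c code; the buffer size,
struct name, and interrupt handling are all simplified:

/*
 * Illustrative sketch only -- not the kernel's actual random.c.
 * The idea: each CPU keeps a small buffer of random words that is
 * consumed locklessly; only when it runs dry does the slow path call
 * get_random_bytes() to refill it. Names and sizes here are made up.
 */
struct batch_u32 {
	u32 entropy[16];
	unsigned int position;
};
static DEFINE_PER_CPU(struct batch_u32, batched_entropy_u32);

u32 get_random_u32_sketch(void)
{
	struct batch_u32 *batch;
	unsigned long flags;
	u32 ret;

	/* Keep us on this CPU and out of our own irq handlers. */
	local_irq_save(flags);
	batch = this_cpu_ptr(&batched_entropy_u32);
	if (batch->position == 0) {
		/* Rare slow path: refill from the ChaCha-based CRNG. */
		get_random_bytes(batch->entropy, sizeof(batch->entropy));
		batch->position = ARRAY_SIZE(batch->entropy);
	}
	ret = batch->entropy[--batch->position];
	local_irq_restore(flags);
	return ret;
}

The fast path is just an array index and a decrement on per-CPU data,
which is why it benchmarks so close to the old prandom.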
That's not to say we couldn't optimize the whole thing even further. I
think we probably can, should it become necessary or desirable or even
simply a fun thing to do. But hard as I tried, I couldn't find anywhere
that this was a problem from a performance perspective. And in some
real-world cases (e.g. the network stack), the performance was a little
better.
So far, the one category of gotchas is when this is used from inside of
a raw spinlock, because that makes RT upset. I've seen two cases of
this, both of which were trivial to resolve. If it balloons into lots of
cases, or if multiple hard-to-address categories emerge, then maybe
we'll need to look at it differently. But for now this seems like a very
manageable problem.
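For reference, the kind of pattern that trips RT looks roughly like
this -- a made-up example with a hypothetical lock, not one of the two
actual cases:

/*
 * Hypothetical driver code, not one of the two real cases. On
 * PREEMPT_RT, spinlock_t becomes a sleeping lock, while raw_spinlock_t
 * does not. If get_random_u32() hits its rare refill path and takes
 * the CRNG's spinlock_t while we hold a raw spinlock, RT sees a
 * potential sleep in a non-sleepable context and complains.
 */
static DEFINE_RAW_SPINLOCK(example_lock);

static void example_bad(void)
{
	unsigned long flags;
	u32 r;

	raw_spin_lock_irqsave(&example_lock, flags);
	r = get_random_u32();	/* BAD on RT: may take a sleeping lock */
	/* ... use r ... */
	raw_spin_unlock_irqrestore(&example_lock, flags);
}

static void example_fixed(void)
{
	unsigned long flags;
	u32 r;

	/* Trivial fix: get the random number before taking the raw lock. */
	r = get_random_u32();

	raw_spin_lock_irqsave(&example_lock, flags);
	/* ... use r ... */
	raw_spin_unlock_irqrestore(&example_lock, flags);
}

Moving the call outside the raw critical section, as in the second
function, is the sort of trivial rearrangement I have in mind.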
Jason