[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CANn89iK3maLVo_G7MGswuXV0Og9tEFJxMZt+34ZKTo4zUNoLRw@mail.gmail.com>
Date: Sat, 1 Oct 2022 15:31:15 -0700
From: Eric Dumazet <edumazet@...gle.com>
To: "Jason A. Donenfeld" <Jason@...c4.com>
Cc: Christophe Leroy <christophe.leroy@...roup.eu>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
David Dworken <ddworken@...gle.com>,
Willem de Bruijn <willemb@...gle.com>,
"David S. Miller" <davem@...emloft.net>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: 126 ms irqsoff Latency - Possibly due to commit 190cc82489f4
("tcp: change source port randomizarion at connect() time")
On Sat, Oct 1, 2022 at 3:16 PM Jason A. Donenfeld <Jason@...c4.com> wrote:
>
> (CC+Sebastian)
>
> Hi Eric, Christophe,
>
> I'm trying to understand the context of this and whether/why there's a
> problem. Some overview on how get_random_bytes() works:
>
> Most of the time, get_random_bytes() is completely lockless and operates
> over per-CPU data structures. get_random_bytes() calls
> _get_random_bytes(), which calls crng_make_state(), and then operates
> over stack data to churn out some random bytes. crng_make_state() is
> where all the meat happens.
>
> In crng_make_state(), there are three unlikely conditionals where locks
> are taken. The first is:
>
> if (!crng_ready()) {
> ... do some expensive things involving locks ...
> ... but only during early boot before the rng is initialized ...
> }
>
> The second one is:
>
> if (unlikely(time_is_before_jiffies(READ_ONCE(base_crng.birth) + crng_reseed_interval()))) {
> ... do something less expensive involving locks ...
> ... which happens approximately once per minute ...
> }
>
> The third one is:
>
> if (unlikely(crng->generation != READ_ONCE(base_crng.generation))) {
> ... do something even less expensive involving locks ...
> ... which happens when after a different cpu hit the above ...
> }
>
> So all three of these conditions are pretty darn unlikely, with the
> exception of the first one that happens all the time during early boot
> before the RNG is initialized, after which it is static-branched out and
> never triggers again. So as far as /locks/ are concerned, things should
> be good here.
>
> However, in order to operate on per-cpu data, and therefore be lockless
> most of the time, it does take a "local lock", which is basically just
> disabling interrupts on non-RT to do a short operation:
>
> local_lock_irqsave(&crngs.lock, flags);
> crng = raw_cpu_ptr(&crngs);
> crng_fast_key_erasure(...);
> local_unlock_irqrestore(&crngs.lock, flags);
>
> crng_fast_key_erasure(), in turn, computes a single block of chacha20,
> which should be relatively fast. So the critical section is very short
> there.
>
> The reason that's local_lock_irqsave() rather than local_lock() (which
> would only disable preemption, I believe), is because IRQ handlers are
> supposed to be able to have access to random bytes too. It seems like it
> wouldn't be a super nice thing to remove that capability.
>
> It might be possible to double the amount of per-cpu data and have a
> separate state for IRQ than for non-IRQ, but that seems kind of wasteful
> and complex/hairy to implement.
>
> So that leads me to wonder more about the context: why does this matter?
> It looks like you're hitting this from a DO_ONCE() thing, which are
> usually only hit, as the name says, once, and then incur the overhead of
> firing off a worker to change the once-static-branch, which means
> DO_ONCE()es aren't very fast anyway? Or does that not accurately reflect
> what's happening?
>
> I'll also CC Sebastian here, who worked with me on that local lock and
> might have some insights on IRQ latency as well.
Sorry Jason, it seems I forgot to CC you on the tentative patch I sent
earlier today
https://patchwork.kernel.org/project/netdevbpf/patch/20221001205102.2319658-1-eric.dumazet@gmail.com/
Powered by blists - more mailing lists