lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Sat, 1 Oct 2022 15:31:15 -0700 From: Eric Dumazet <edumazet@...gle.com> To: "Jason A. Donenfeld" <Jason@...c4.com> Cc: Christophe Leroy <christophe.leroy@...roup.eu>, "netdev@...r.kernel.org" <netdev@...r.kernel.org>, David Dworken <ddworken@...gle.com>, Willem de Bruijn <willemb@...gle.com>, "David S. Miller" <davem@...emloft.net>, Sebastian Andrzej Siewior <bigeasy@...utronix.de> Subject: Re: 126 ms irqsoff Latency - Possibly due to commit 190cc82489f4 ("tcp: change source port randomizarion at connect() time") On Sat, Oct 1, 2022 at 3:16 PM Jason A. Donenfeld <Jason@...c4.com> wrote: > > (CC+Sebastian) > > Hi Eric, Christophe, > > I'm trying to understand the context of this and whether/why there's a > problem. Some overview on how get_random_bytes() works: > > Most of the time, get_random_bytes() is completely lockless and operates > over per-CPU data structures. get_random_bytes() calls > _get_random_bytes(), which calls crng_make_state(), and then operates > over stack data to churn out some random bytes. crng_make_state() is > where all the meat happens. > > In crng_make_state(), there are three unlikely conditionals where locks > are taken. The first is: > > if (!crng_ready()) { > ... do some expensive things involving locks ... > ... but only during early boot before the rng is initialized ... > } > > The second one is: > > if (unlikely(time_is_before_jiffies(READ_ONCE(base_crng.birth) + crng_reseed_interval()))) { > ... do something less expensive involving locks ... > ... which happens approximately once per minute ... > } > > The third one is: > > if (unlikely(crng->generation != READ_ONCE(base_crng.generation))) { > ... do something even less expensive involving locks ... > ... which happens when after a different cpu hit the above ... > } > > So all three of these conditions are pretty darn unlikely, with the > exception of the first one that happens all the time during early boot > before the RNG is initialized, after which it is static-branched out and > never triggers again. So as far as /locks/ are concerned, things should > be good here. > > However, in order to operate on per-cpu data, and therefore be lockless > most of the time, it does take a "local lock", which is basically just > disabling interrupts on non-RT to do a short operation: > > local_lock_irqsave(&crngs.lock, flags); > crng = raw_cpu_ptr(&crngs); > crng_fast_key_erasure(...); > local_unlock_irqrestore(&crngs.lock, flags); > > crng_fast_key_erasure(), in turn, computes a single block of chacha20, > which should be relatively fast. So the critical section is very short > there. > > The reason that's local_lock_irqsave() rather than local_lock() (which > would only disable preemption, I believe), is because IRQ handlers are > supposed to be able to have access to random bytes too. It seems like it > wouldn't be a super nice thing to remove that capability. > > It might be possible to double the amount of per-cpu data and have a > separate state for IRQ than for non-IRQ, but that seems kind of wasteful > and complex/hairy to implement. > > So that leads me to wonder more about the context: why does this matter? > It looks like you're hitting this from a DO_ONCE() thing, which are > usually only hit, as the name says, once, and then incur the overhead of > firing off a worker to change the once-static-branch, which means > DO_ONCE()es aren't very fast anyway? Or does that not accurately reflect > what's happening? > > I'll also CC Sebastian here, who worked with me on that local lock and > might have some insights on IRQ latency as well. Sorry Jason, it seems I forgot to CC you on the tentative patch I sent earlier today https://patchwork.kernel.org/project/netdevbpf/patch/20221001205102.2319658-1-eric.dumazet@gmail.com/
Powered by blists - more mailing lists