Date: Sun, 2 Oct 2022 00:16:17 +0200
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Christophe Leroy <christophe.leroy@...roup.eu>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	David Dworken <ddworken@...gle.com>,
	Willem de Bruijn <willemb@...gle.com>,
	"David S. Miller" <davem@...emloft.net>,
	bigeasy@...utronix.de
Subject: Re: 126 ms irqsoff Latency - Possibly due to commit 190cc82489f4
	("tcp: change source port randomizarion at connect() time")

(CC+Sebastian)

Hi Eric, Christophe,

I'm trying to understand the context of this and whether/why there's a
problem. Some overview of how get_random_bytes() works:

Most of the time, get_random_bytes() is completely lockless and operates
over per-CPU data structures. get_random_bytes() calls _get_random_bytes(),
which calls crng_make_state(), and then operates over stack data to churn
out some random bytes.

crng_make_state() is where all the meat happens. In crng_make_state(),
there are three unlikely conditionals where locks are taken.

The first is:

    if (!crng_ready()) {
        ... do some expensive things involving locks ...
        ... but only during early boot before the rng is initialized ...
    }

The second one is:

    if (unlikely(time_is_before_jiffies(READ_ONCE(base_crng.birth) +
                                        crng_reseed_interval()))) {
        ... do something less expensive involving locks ...
        ... which happens approximately once per minute ...
    }

The third one is:

    if (unlikely(crng->generation != READ_ONCE(base_crng.generation))) {
        ... do something even less expensive involving locks ...
        ... which happens after a different CPU hit the above ...
    }

So all three of these conditions are pretty darn unlikely, with the
exception of the first one, which happens all the time during early boot
before the RNG is initialized, after which it is static-branched out and
never triggers again. So as far as /locks/ are concerned, things should be
good here.

However, in order to operate on per-CPU data, and therefore be lockless
most of the time, it does take a "local lock", which is basically just
disabling interrupts on non-RT, to do a short operation:

    local_lock_irqsave(&crngs.lock, flags);
    crng = raw_cpu_ptr(&crngs);
    crng_fast_key_erasure(...);
    local_unlock_irqrestore(&crngs.lock, flags);

crng_fast_key_erasure(), in turn, computes a single block of ChaCha20,
which should be relatively fast. So the critical section is very short
there.

The reason that's local_lock_irqsave() rather than local_lock() (which
would only disable preemption, I believe) is that IRQ handlers are
supposed to be able to have access to random bytes too. It seems like it
wouldn't be a super nice thing to remove that capability. It might be
possible to double the amount of per-CPU data and have a separate state
for IRQ than for non-IRQ, but that seems kind of wasteful and complex/hairy
to implement.

So that leads me to wonder more about the context: why does this matter?
It looks like you're hitting this from a DO_ONCE() thing, which is usually
only hit, as the name says, once, and then incurs the overhead of firing
off a worker to change the once-static-branch, which means DO_ONCE()es
aren't very fast anyway? Or does that not accurately reflect what's
happening?

I'll also CC Sebastian here, who worked with me on that local lock and
might have some insights on IRQ latency as well.

Regards,
Jason
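
For reference, a simplified sketch of the crng_make_state() structure the
message above describes, paraphrasing drivers/char/random.c from around that
kernel release; types, error paths, and some helper names are approximated,
so this is an illustration of the flow rather than the verbatim source:

    /*
     * Simplified sketch of crng_make_state(), trimmed for illustration;
     * not verbatim kernel source -- early-boot handling abbreviated.
     */
    static void crng_make_state(u32 chacha_state[CHACHA_STATE_WORDS],
                                u8 random_data[32], size_t random_data_len)
    {
            unsigned long flags;
            struct crng *crng;

            /* 1. Early boot only: slow path taking base_crng.lock. Once the
             *    RNG is initialized, crng_ready() is patched via a static
             *    branch and this never runs again. */
            if (!crng_ready()) {
                    /* ... derive fresh key material under base_crng.lock ... */
            }

            /* 2. Roughly once per minute: the base key is stale, so reseed.
             *    crng_reseed() takes base_crng.lock and bumps the generation. */
            if (unlikely(time_is_before_jiffies(READ_ONCE(base_crng.birth) +
                                                crng_reseed_interval())))
                    crng_reseed();

            /* Fast path: per-CPU state, no spinlocks. The local lock disables
             * interrupts on non-RT so IRQ handlers can also get random bytes. */
            local_lock_irqsave(&crngs.lock, flags);
            crng = raw_cpu_ptr(&crngs);

            /* 3. Another CPU reseeded base_crng since this CPU last derived
             *    its key: refresh the per-CPU key under base_crng.lock. */
            if (unlikely(crng->generation != READ_ONCE(base_crng.generation))) {
                    spin_lock(&base_crng.lock);
                    crng_fast_key_erasure(base_crng.key, chacha_state,
                                          crng->key, sizeof(crng->key));
                    crng->generation = base_crng.generation;
                    spin_unlock(&base_crng.lock);
            }

            /* One ChaCha20 block produces the output and erases the key just
             * used, keeping the interrupts-off critical section short. */
            crng_fast_key_erasure(crng->key, chacha_state,
                                  random_data, random_data_len);
            local_unlock_irqrestore(&crngs.lock, flags);
    }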
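
And on the DO_ONCE() point raised above, a hedged sketch of what a
DO_ONCE(get_random_bytes, buf, len) call site effectively does, paraphrasing
include/linux/once.h; the helper signatures are approximate and the caller
name is hypothetical:

    static bool done;
    static DEFINE_STATIC_KEY_TRUE(once_key);

    static void fill_table_once(void *buf, size_t len)  /* hypothetical caller */
    {
            unsigned long flags;

            if (!static_branch_unlikely(&once_key))
                    return;                 /* cheap path after first use */

            if (!__do_once_start(&done, &flags))
                    return;                 /* another CPU already did it */

            get_random_bytes(buf, len);     /* this reaches crng_make_state() */

            /*
             * Marks 'done', drops the lock taken in __do_once_start(), and
             * queues a work item that later disables 'once_key', so
             * subsequent calls only hit the static branch above.
             */
            __do_once_done(&done, &once_key, &flags, THIS_MODULE);
    }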