[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHmME9oXSx4JS0ZJeZTb7VC3gXoackuH389V9FDknHf_-rDJyA@mail.gmail.com>
Date: Sun, 30 Jan 2022 23:55:09 +0100
From: "Jason A. Donenfeld" <Jason@...c4.com>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>,
"Theodore Ts'o" <tytso@....edu>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
Herbert Xu <herbert@...dor.apana.org.au>
Subject: Re: [PATCH 5/5] random: Defer processing of randomness on PREEMPT_RT.
Hey Sebastian,
I spent the weekend thinking about this some more. I'm actually
warming up a bit to the general approach of the original solution
here, though still have questions. To summarize my understanding of
where we are:
Alternative solution we've been discussing:
- Replace spinlock_t with raw spinlocks.
- Ratelimit userspace-triggered latency inducing ioctls with
ratelimit() and an additional mutex of sorts.
- Result: pretty much the same structure we have now, but with some
added protection for PREEMPT_RT.
Your original solution:
- Absorb into the fast pool during the actual IRQ, but never dump it
into the main pool (nor fast load into the crng directly if
crng_init==0) from the hard irq.
- Instead, have irq_thread() check to see if the calling CPU's fast
pool is >= 64, and if so, dump it into the main pool (or fast load
into the crng directly if crng_init==0).
I have two questions about the implications of your original solution:
1) How often does irq_thread() run? With what we have now, we dump the
fast pool into the main pool at exactly 64 events. With what you're
proposing, we're now in >= 64 territory. How do we conceptualize how
far beyond 64 it's likely to grow before irq_thread() does something?
Is it easy to make guarantees like, "at most, probably around 17"? Or
is it potentially unbounded? Growing beyond isn't actually necessarily
a bad thing, but it could potentially *slow* the collection of
entropy. That probably matters more in the crng_init==0 mode, where
we're just desperate to get whatever we can as fast as we can. But
depending on how large that is, it could matter for the main case too.
Having some handle on the latency added here would be helpful for
thinking about this.
2) If we went with this solution, I think I'd prefer to actually do it
unconditionally, for PREEMPT_RT=y and PREEMPT_RT=n. It's easier to
track how this thing works if the state machine always works in one
way instead of two. It also makes thinking about performance margins
for the various components easier if there's only one way used. Do you
see any downsides in doing this unconditionally?
Regards,
Jason
Powered by blists - more mailing lists