Message-ID: <CAHmME9q2Yid56ZZ9sBQWjEWEK2B06g3H9KYRwWqExXRoCdbPdA@mail.gmail.com>
Date:   Tue, 7 Dec 2021 19:14:16 +0100
From:   "Jason A. Donenfeld" <Jason@...c4.com>
To:     Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        "Theodore Ts'o" <tytso@....edu>,
        Thomas Gleixner <tglx@...utronix.de>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH 5/5] random: Defer processing of randomness on PREEMPT_RT.

Hi Sebastian,

Thanks for this series.

> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -1281,6 +1281,9 @@ static int irq_thread(void *data)
>                 if (action_ret == IRQ_WAKE_THREAD)
>                         irq_wake_secondary(desc, action);
>
> +               if (IS_ENABLED(CONFIG_PREEMPT_RT))
> +                       process_interrupt_randomness();
> +

Adding a path from irq_thread() (and complicating the callgraph)
strikes me as a rather large hammer to solve this problem with. Maybe
it's the only way. But I wonder:

> on the lock if contended. The extraction of entropy (extract_buf())
> needs a few cycles more because it performs additionally few
> SHA1 transformations. This takes around 5-10us on a testing box (E5-2650
> 32 Cores, 2way NUMA) and is negligible.
> The frequent invocation of the IOCTLs RNDADDTOENTCNT and RNDRESEEDCRNG
> on multiple CPUs in parallel leads to filling and depletion of the pool
> which in turn results in heavy contention on the lock. The spinning with
> disabled interrupts on multiple CPUs leads to latencies of at least
> 100us on the same machine which is no longer acceptable.

I wonder if this problem would partially go away if, instead, I could
figure out how to make extract_buf() faster. I'd like to replace SHA1
in there with something else anyway. I'm not sure what sort of
speedup I could get, but perhaps something could be eked out. Would
this be viable? Or do you think that even with various speedups the
problem would still exist in one way or another?

Alternatively, is there a different locking scheme that would let the
irq path mostly win and leave the ioctl path spinning instead?
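
To make that question concrete, here is a minimal user-space sketch of
one such scheme, written with C11 atomics. It is only an analogy and
not kernel code: struct prio_lock and every function name below are
hypothetical, and it ignores what PREEMPT_RT does to spinlock_t as well
as fairness among waiters of the same class. The idea shown is that the
high-priority ("irq-like") path announces itself before spinning, and
the low-priority ("ioctl-like") path refuses to take the lock while any
such announcement is pending.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical illustration only; none of these names exist in the kernel. */
struct prio_lock {
	atomic_bool locked;     /* true while some path holds the lock */
	atomic_int  hi_waiters; /* high-priority paths currently waiting */
};

static void prio_lock_init(struct prio_lock *l)
{
	atomic_init(&l->locked, false);
	atomic_init(&l->hi_waiters, 0);
}

/* High-priority path: announce intent first, then spin for the lock. */
static void prio_lock_hi(struct prio_lock *l)
{
	atomic_fetch_add(&l->hi_waiters, 1);
	while (atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
		; /* spin until the current holder releases */
	atomic_fetch_sub(&l->hi_waiters, 1);
}

/* Low-priority path: a single attempt that yields to pending hi waiters. */
static bool prio_trylock_lo(struct prio_lock *l)
{
	if (atomic_load(&l->hi_waiters) > 0)
		return false;
	return !atomic_exchange_explicit(&l->locked, true, memory_order_acquire);
}

/* Low-priority path: keep retrying, so this side absorbs the spinning. */
static void prio_lock_lo(struct prio_lock *l)
{
	while (!prio_trylock_lo(l))
		;
}

/* Release; the same for both paths. */
static void prio_unlock(struct prio_lock *l)
{
	assert(atomic_load(&l->locked)); /* must only be called by a holder */
	atomic_store_explicit(&l->locked, false, memory_order_release);
}
```

A high-priority caller can still wait for one low-priority critical
section that is already in progress, so a scheme like this bounds the
irq-side wait by the length of that section rather than removing it.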

Also, just curious, what is running RNDRESEEDCRNG so much on a
PREEMPT_RT system and why?

Jason