Message-ID: <ecc11ec7-ba00-41f1-8a2a-8f3a83c9ffd9@arm.com>
Date: Mon, 24 Nov 2025 17:50:14 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: Kees Cook <kees@...nel.org>, Will Deacon <will@...nel.org>
Cc: Arnd Bergmann <arnd@...db.de>, Ard Biesheuvel <ardb@...nel.org>,
Jeremy Linton <jeremy.linton@....com>,
Catalin Marinas <Catalin.Marinas@....com>,
Mark Rutland <mark.rutland@....com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [DISCUSSION] kstack offset randomization: bugs and performance
On 24/11/2025 17:11, Kees Cook wrote:
>
>
> On November 24, 2025 6:36:25 AM PST, Will Deacon <will@...nel.org> wrote:
>> On Mon, Nov 17, 2025 at 11:31:22AM +0000, Ryan Roberts wrote:
>>> On 17/11/2025 11:30, Ryan Roberts wrote:
>>>> Could this give us a middle ground between strong-crng and
>>>> weak-timestamp-counter? Perhaps the main issue is that we need to store the
>>>> secret key for a long period?
>>>>
>>>>
>>>> Anyway, I plan to work up a series with the bugfixes and performance
>>>> improvements. I'll add the siphash approach as an experimental addition and get
>>>> some more detailed numbers for all the options. But wanted to raise it all here
>>>> first to get any early feedback.
>>
>> FWIW, I share Mark's concerns about using a counter for this. Given that
>> the feature currently appears to be both slow _and_ broken I'd vote for
>> either removing it or switching over to per-thread offsets as a first
>> step.
>
> That it has potential weaknesses doesn't mean it should be entirely removed.
>
>> We already have a per-task stack canary with
>> CONFIG_STACKPROTECTOR_PER_TASK so I don't understand the reluctance to
>> do something similar here.
>
> That's not a reasonable comparison: the stack canary cannot change arbitrarily for a task or it would immediately crash on a function return. :)
>
>> Speeding up the crypto feels like something that could happen separately.
>
> Sure. But let's see what Ryan's patches look like. The suggested changes sound good to me.
Just to say I haven't forgotten about this; I ended up having to switch to
something more urgent. Hoping to get back to it later this week. I don't think
this is an urgent issue, so hopefully folks are ok waiting.
I propose to post whatever I end up with, then we can all discuss from there.
But the rough shape so far:
Fixes:
- Remove choose_random_kstack_offset()
- arch passes random into add_random_kstack_offset() (fixes migration bypass)
- Move add_random_kstack_offset() to el0_svc()/el0_svc_compat() (before
  enabling interrupts) to fix the non-preemption requirement on arm64; rough
  sketch below
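
To make the entry-path change concrete, something like the below is the shape
I have in mind. This is a rough sketch only, not the actual series: it is
based on arm64's el0_svc() in entry-common.c, kstack_rnd_next() is a
hypothetical per-cpu helper, and add_random_kstack_offset() taking the random
value as an argument is the proposed new interface, not the current one.

/*
 * Rough sketch: offset is chosen and applied before local_daif_restore()
 * enables interrupts, so the per-cpu prng state is accessed without any
 * preemption or interrupt window, and the value is consumed directly
 * rather than stashed per-cpu (which is what allowed the migration bypass).
 */
static void noinstr el0_svc(struct pt_regs *regs)
{
	enter_from_user_mode(regs);
	add_random_kstack_offset(kstack_rnd_next());
	local_daif_restore(DAIF_PROCCTX);	/* interrupts on from here */
	do_el0_svc(regs);
	exit_to_user_mode(regs);
}
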
Perf Improvements:
- Based on Jeremy's prng, but buffer the 32 bits and use 6 bits per syscall
  (so the cost of prng generation is amortized over 5 syscalls; sketch after
  this list)
- Reseed prng using get_random_u64() every 64K prng invocations (so cost of
get_random_u64() is amortized over 64K*5 syscalls)
- So while get_random_u64() still has a latency spike, it's so infrequent that
it doesn't show up in p99.9 for my benchmarks.
- If we want to change it to per-task, I think it's all amenable.
- I'll leave the timer off limits for arm64.
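
And a sketch of the buffered prng, to show how the two levels of amortization
fit together. All names are illustrative and the xorshift step stands in for
Jeremy's prng; the real series may differ:

/*
 * One prng step refills a 32-bit buffer; each syscall consumes 6 bits,
 * so one step covers 5 syscalls. The u16 counter wraps every 64K prng
 * steps, triggering a get_random_u64() reseed. Caller runs with
 * interrupts masked, so plain per-cpu access is fine.
 */
struct kstack_rnd {
	u64 state;	/* prng state */
	u32 buf;	/* buffered prng output */
	u8  nbits;	/* bits left in buf */
	u16 count;	/* prng steps since last reseed; 0 => reseed */
};
static DEFINE_PER_CPU(struct kstack_rnd, kstack_rnd);

static u32 kstack_rnd_next(void)
{
	struct kstack_rnd *r = this_cpu_ptr(&kstack_rnd);
	u32 ret;

	if (r->nbits < 6) {
		if (r->count++ == 0)
			r->state = get_random_u64() ?: 1; /* non-zero state */
		/* xorshift64: cheap prng step */
		r->state ^= r->state << 13;
		r->state ^= r->state >> 7;
		r->state ^= r->state << 17;
		r->buf = (u32)r->state;
		r->nbits = 32;
	}
	ret = r->buf & 0x3f;	/* 6 bits per syscall */
	r->buf >>= 6;
	r->nbits -= 6;
	return ret;
}
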
I'm seeing some inconsistencies in the performance measurements, though, so I
need to get that understood properly first.
Thanks,
Ryan
>
> -Kees
>
>