Message-ID: <CAMj1kXG4ELUdxGMp0+tNiV+7ygfHbe-u_pGPMTVj6UNdzQ+sDQ@mail.gmail.com>
Date: Wed, 26 Nov 2025 23:58:40 +0100
From: Ard Biesheuvel <ardb@...nel.org>
To: Ryan Roberts <ryan.roberts@....com>
Cc: Kees Cook <kees@...nel.org>, Will Deacon <will@...nel.org>, Arnd Bergmann <arnd@...db.de>,
Jeremy Linton <jeremy.linton@....com>, Catalin Marinas <Catalin.Marinas@....com>,
Mark Rutland <mark.rutland@....com>,
"linux-arm-kernel@...ts.infradead.org" <linux-arm-kernel@...ts.infradead.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [DISCUSSION] kstack offset randomization: bugs and performance

On Tue, 25 Nov 2025 at 12:15, Ryan Roberts <ryan.roberts@....com> wrote:
>
> On 24/11/2025 20:51, Kees Cook wrote:
> > On Mon, Nov 24, 2025 at 05:50:14PM +0000, Ryan Roberts wrote:
> >> On 24/11/2025 17:11, Kees Cook wrote:
> >>>
> >>>
> >>> On November 24, 2025 6:36:25 AM PST, Will Deacon <will@...nel.org> wrote:
> >>>> On Mon, Nov 17, 2025 at 11:31:22AM +0000, Ryan Roberts wrote:
> >>>>> On 17/11/2025 11:30, Ryan Roberts wrote:
> >>>>>> Could this give us a middle ground between strong-crng and
> >>>>>> weak-timestamp-counter? Perhaps the main issue is that we need to store the
> >>>>>> secret key for a long period?
> >>>>>>
> >>>>>>
> >>>>>> Anyway, I plan to work up a series with the bugfixes and performance
> >>>>>> improvements. I'll add the siphash approach as an experimental addition and get
> >>>>>> some more detailed numbers for all the options. But wanted to raise it all here
> >>>>>> first to get any early feedback.
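
(For reference, my understanding of the siphash idea is something
like the below. Sketch only: the kstack_nonce task field is invented
here for illustration, and the key would have to be generated at boot
and kept secret for the lifetime of the system, which is exactly the
concern raised above.)

#include <linux/siphash.h>

static siphash_key_t kstack_key __ro_after_init; /* from the CRNG at boot */

/*
 * Derive the per-syscall offset as a keyed PRF of a cheap per-task
 * input, instead of consuming CRNG output on the syscall fast path.
 */
static __always_inline u16 kstack_offset_prf(void)
{
	return siphash_1u64(++current->kstack_nonce, &kstack_key);
}
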
> >>>>
> >>>> FWIW, I share Mark's concerns about using a counter for this. Given that
> >>>> the feature currently appears to be both slow _and_ broken I'd vote for
> >>>> either removing it or switching over to per-thread offsets as a first
> >>>> step.
> >>>
> >>> That it has potential weaknesses doesn't mean it should be entirely removed.
> >>>
> >>>> We already have a per-task stack canary with
> >>>> CONFIG_STACKPROTECTOR_PER_TASK so I don't understand the reluctance to
> >>>> do something similar here.
> >>>
> >>> That's not a reasonable comparison: the stack canary cannot change arbitrarily for a task or it would immediately crash on a function return. :)
> >>>
> >>>> Speeding up the crypto feels like something that could happen separately.
> >>>
> >>> Sure. But let's see what Ryan's patches look like. The suggested changes sound good to me.
> >>
> >> Just to say I haven't forgotten about this; I ended up having to switch to
> >> something more urgent. Hoping to get back to it later this week. I don't think
> >> this is an urgent issue, so hopefully folks are ok waiting.
> >>
> >> I propose to post whatever I end up with, then we can all discuss from there.
> >> But the rough shape so far:
> >>
> >> Fixes:
> >> - Remove choose_random_kstack_offset()
> >> - arch passes random into add_random_kstack_offset() (fixes migration bypass)
> >> - Move add_random_kstack_offset() to el0_svc()/el0_svc_compat() (before
> >> enabling interrupts) to fix non-preemption requirement (arm64)
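
(Roughly the following shape, I assume. Sketch only: the extra
parameter to add_random_kstack_offset() is the proposed change above,
not what is in mainline today, and get_random_u16() just stands in
for whatever per-arch entropy source we settle on.)

/* arch/arm64/kernel/entry-common.c, simplified */
static void noinstr el0_svc(struct pt_regs *regs)
{
	enter_from_user_mode(regs);
	/*
	 * Draw the entropy and apply the offset in one step, before
	 * interrupts are enabled, so neither preemption nor migration
	 * can get in between the two.
	 */
	add_random_kstack_offset(get_random_u16());
	local_daif_restore(DAIF_PROCCTX);
	do_el0_svc(regs);
	exit_to_user_mode(regs);
}
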
> >
> > I thought we'd keep choose_random_kstack_offset() and just move
> > everything into a per-task location? (And for arm64 only)
>
> Err... I thought you were the one arguing against per-task state? I'm not really
> keen on having arm64 do a completely different thing to everyone else; it seems
> reasonable to me that we only need to (continue to) abstract the random source
> per-arch and the rest should remain common?
>
> Per my previous mails, I'm not really sure what choose_random_kstack_offset() is
> giving us in practice. Why not simplify?
>

It seems to me that arm64 inherited a lot of complexity from other
architectures that rely on a counter. AIUI, the 'non-preemption
requirement' and the 'migration bypass' issues both stem from relying
on the syscall duration to provide additional pseudo-randomness, which
is recorded in a per-CPU variable when the syscall completes and
carried over to the next invocation. (I don't really buy the idea that
having the next offset in memory for a shorter time buys us anything
in terms of robustness, tbh.)
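
To recap the current scheme (simplified from
include/linux/randomize_kstack.h, static branch omitted):

DEFINE_PER_CPU(u32, kstack_offset);

/* syscall entry: consume the value left behind by the previous syscall */
#define add_random_kstack_offset() do {					\
	u32 offset = raw_cpu_read(kstack_offset);			\
	u8 *ptr = __kstack_alloca(KSTACK_OFFSET_MAX(offset));		\
	asm volatile("" :: "r"(ptr) : "memory");			\
} while (0)

/* syscall exit: fold in a new value for the *next* syscall */
#define choose_random_kstack_offset(rand) do {				\
	u32 offset = raw_cpu_read(kstack_offset);			\
	offset = ror32(offset, 5) ^ (rand);				\
	raw_cpu_write(kstack_offset, offset);				\
} while (0)

The two halves only line up if the task stays on the same CPU and
nothing preempts it in between, which is where both issues come from.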

I have some patches that make get_random_uXX()'s fast path lockless
that I will send out shortly (assuming no hate mail from the bots in
the meantime), which should make it suitable for general use as the
entropy source for kstack randomization on all architectures, modulo
the tail latency issue. I'm not sure I understand why that is a
problem to begin with if it occurs sufficiently rarely, though. Is it
a PREEMPT_RT issue? Would it be better if the refill of the per-CPU
batched entropy buffers were delegated to some kind of kthread so it
can be scheduled independently? (Those buffers are all the same size,
so we could easily keep a few hot spares.)
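
To illustrate the hot spare idea (purely a sketch, nothing like this
exists today; struct batch_entropy stands in for the existing batched
entropy structs in random.c, and pop_spare()/push_empty() are made-up
helpers around a small spinlocked stash):

/*
 * Refill a per-CPU batch by swapping in a pre-filled spare instead of
 * calling into the CRNG from the syscall path, then kick a worker to
 * replenish the stash at its leisure. Only an empty stash would ever
 * fall back to the CRNG slow path and eat the tail latency.
 */
static bool refill_from_spare(struct batch_entropy *batch)
{
	struct batch_entropy *spare = pop_spare();

	if (!spare)
		return false;	/* stash empty: CRNG slow path */
	memcpy(batch->entropy, spare->entropy, sizeof(batch->entropy));
	batch->position = 0;
	/* don't leave a second copy of the secret bytes lying around */
	memzero_explicit(spare->entropy, sizeof(spare->entropy));
	push_empty(spare);
	schedule_work(&replenish_spares_work);
	return true;
}
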
TL;DR: I agree with Ryan, but I'd like to fix this across the board,
i.e., get rid of the per-CPU variable entirely, along with all the
dodgy uses of counters.