Message-ID: <6529fbe6e6af7b87fdc92912b0b6a8878796bb6e.camel@physik.fu-berlin.de>
Date: Sat, 17 Jan 2026 08:00:15 +0100
From: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>
To: Ludwig Rydberg <ludwig.rydberg@...sler.com>, davem@...emloft.net,
andreas@...sler.com, brauner@...nel.org, shuah@...nel.org
Cc: sparclinux@...r.kernel.org, linux-kselftest@...r.kernel.org,
linux-kernel@...r.kernel.org, arnd@...db.de, geert@...ux-m68k.org,
schuster.simon@...mens-energy.com, kernel@...rcher.dialup.fu-berlin.de
Subject: Re: [PATCH 1/3] sparc: Synchronize user stack on fork and clone
Hi,
On Sat, 2026-01-17 at 07:57 +0100, John Paul Adrian Glaubitz wrote:
> Hi Ludwig,
>
> On Fri, 2026-01-16 at 16:30 +0100, Ludwig Rydberg wrote:
> > From: Andreas Larsson <andreas@...sler.com>
> >
> > Flush all uncommitted user windows before calling the generic syscall
> > handlers for clone, fork, and vfork.
> >
> > Prior to entering the arch common handlers sparc_{clone|fork|vfork}, the
> > arch-specific syscall wrappers for these syscalls will attempt to flush
> > all windows (including user windows).
> >
> > In the window overflow trap handlers on both SPARC{32|64},
> > if the window can't be stored (i.e., due to MMU-related faults), the routine
> > backs up the user window and increments a thread counter (wsaved).
> >
> > By adding a synchronization point after the flush attempt, when fault
> > handling is enabled, any uncommitted user windows will be flushed.
> >
> > Link: https://sourceware.org/bugzilla/show_bug.cgi?id=31394
> > Closes: https://lore.kernel.org/sparclinux/fe5cc47167430007560501aabb28ba154985b661.camel@physik.fu-berlin.de/
> > Signed-off-by: Andreas Larsson <andreas@...sler.com>
> > Signed-off-by: Ludwig Rydberg <ludwig.rydberg@...sler.com>
> > ---
> > arch/sparc/kernel/process.c | 38 +++++++++++++++++++++++--------------
> > 1 file changed, 24 insertions(+), 14 deletions(-)
> >
> > diff --git a/arch/sparc/kernel/process.c b/arch/sparc/kernel/process.c
> > index 0442ab00518d..7d69877511fa 100644
> > --- a/arch/sparc/kernel/process.c
> > +++ b/arch/sparc/kernel/process.c
> > @@ -17,14 +17,18 @@
> >
> > asmlinkage long sparc_fork(struct pt_regs *regs)
> > {
> > - unsigned long orig_i1 = regs->u_regs[UREG_I1];
> > + unsigned long orig_i1;
> > long ret;
> > struct kernel_clone_args args = {
> > .exit_signal = SIGCHLD,
> > - /* Reuse the parent's stack for the child. */
> > - .stack = regs->u_regs[UREG_FP],
> > };
> >
> > + synchronize_user_stack();
> > +
> > + orig_i1 = regs->u_regs[UREG_I1];
> > + /* Reuse the parent's stack for the child. */
> > + args.stack = regs->u_regs[UREG_FP];
> > +
> > ret = kernel_clone(&args);
> >
> > /* If we get an error and potentially restart the system
> > @@ -40,16 +44,19 @@ asmlinkage long sparc_fork(struct pt_regs *regs)
> >
> > asmlinkage long sparc_vfork(struct pt_regs *regs)
> > {
> > - unsigned long orig_i1 = regs->u_regs[UREG_I1];
> > + unsigned long orig_i1;
> > long ret;
> > -
> > struct kernel_clone_args args = {
> > .flags = CLONE_VFORK | CLONE_VM,
> > .exit_signal = SIGCHLD,
> > - /* Reuse the parent's stack for the child. */
> > - .stack = regs->u_regs[UREG_FP],
> > };
> >
> > + synchronize_user_stack();
> > +
> > + orig_i1 = regs->u_regs[UREG_I1];
> > + /* Reuse the parent's stack for the child. */
> > + args.stack = regs->u_regs[UREG_FP];
> > +
> > ret = kernel_clone(&args);
> >
> > /* If we get an error and potentially restart the system
> > @@ -65,15 +72,18 @@ asmlinkage long sparc_vfork(struct pt_regs *regs)
> >
> > asmlinkage long sparc_clone(struct pt_regs *regs)
> > {
> > - unsigned long orig_i1 = regs->u_regs[UREG_I1];
> > - unsigned int flags = lower_32_bits(regs->u_regs[UREG_I0]);
> > + unsigned long orig_i1;
> > + unsigned int flags;
> > long ret;
> > + struct kernel_clone_args args = {0};
> >
> > - struct kernel_clone_args args = {
> > - .flags = (flags & ~CSIGNAL),
> > - .exit_signal = (flags & CSIGNAL),
> > - .tls = regs->u_regs[UREG_I3],
> > - };
> > + synchronize_user_stack();
> > +
> > + orig_i1 = regs->u_regs[UREG_I1];
> > + flags = lower_32_bits(regs->u_regs[UREG_I0]);
> > + args.flags = (flags & ~CSIGNAL);
> > + args.exit_signal = (flags & CSIGNAL);
> > + args.tls = regs->u_regs[UREG_I3];
> >
> > #ifdef CONFIG_COMPAT
> > if (test_thread_flag(TIF_32BIT)) {
>
> I tested this patch, applied on top of kernel version 6.19-rc5, on a Sun Netra 240
> using the following test program written by Michael Karcher:
>
> glaubitz@...erin:~$ cat attack_on_the_clone.c
> // SPARC64 clone problem demonstration
> //
> // the sparc64 Linux kernel fails to execute clone if %sp points into uncommitted memory (e.g. due to lazy
> // stack committing). This program uses a variable length array on the stack to position the stack pointer when
> // invoking the library function clone just at a page boundary. The library function clone allocates a stack frame
> // that is completely in uncommitted memory before entering the kernel call clone.
>
> // to probe for the correct size of the VLA, a test function is called first. This function records the %fp value it
> // receives (which will be the %fp value in the library function clone, too, if the VLA size is equal)
>
> // (c) Michael Karcher (kernel@...rcher.dialup.fu-berlin.de) , 2024, GPLv2 or later
>
> #define _GNU_SOURCE
>
> #include <sys/mman.h>
> #include <sys/wait.h>
> #include <sched.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <stdint.h>
>
> #define SPARC64_STACK_BIAS 0x7FF
>
> typedef int fn_t(void*);
> typedef pid_t clone_t(fn_t* entry, void* stack, int flags, void* arg, ...);
>
>
> // very simple function invoked using clone
> int nop(void* bar)
> {
>     return 0;
> }
>
>
> // clone substitute that records %fp
> uint64_t call_clone_sp;
>
> pid_t dummy_clone(fn_t* entry, void* stack, int flags, void* arg, ...)
> {
>     register uint64_t frameptr asm("fp");
>     call_clone_sp = frameptr + SPARC64_STACK_BIAS; // sp in call_clone is fp in dummy_clone / clone
>     return -1;
> }
>
>
> // function to invoke clone with (im)properly aligned stack
> void* child_stack;
>
> int call_clone(int waste_qwords, clone_t* clonefn)
> {
>     void* volatile waste[waste_qwords+2]; // volatile to not optimize the array away
>     waste[waste_qwords+1] = NULL;
>
>     pid_t child_pid = clonefn(nop,
>                               child_stack,
>                               CLONE_VM | SIGCHLD,
>                               0);
>     if (child_pid > 0)
>     {
>         pid_t waitresult = waitpid(child_pid, NULL, 0);
>         // before fork-bombing anything if this doesn't go to plan, exit
>         if (waitresult != child_pid) abort();
>         return 0;
>     }
>     else
>     {
>         return -1;
>     }
> }
>
> int main(void)
> {
>     int wasteamount;
>     child_stack = mmap(NULL, 16384, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE, -1, 0);
>     call_clone(0, dummy_clone);
>     printf("effective FP in clone() with waste 0 = %llx\n", call_clone_sp);
>     wasteamount = 1024 + (call_clone_sp & 0xFFF) / 8;
>     printf("this is %d 64-bit words above the page boundary at least 8K away\n", wasteamount);
>     child_stack = (void*)((char*)child_stack + 16000);
>     clone(NULL, NULL, 0, 0); // fails, but resolves "clone"
>     // fails for wasteamount-22 to wasteamount+22 (only even values tested)
>     if (call_clone(wasteamount, clone) < 0)
>     {
>         perror("clone");
>     }
>     else
>     {
>         puts("Congratulations, clone succeeded\n");
>     }
> }
>
> glaubitz@...erin:~$ gcc -o attack_on_the_clone attack_on_the_clone.c
> glaubitz@...erin:~$
>
> Without the patch:
>
> glaubitz@...erin:~$ uname -a
> Linux raverin 6.19.0-rc5 #19 Sat Jan 17 06:32:58 UTC 2026 sparc64 GNU/Linux
> glaubitz@...erin:~$ ./attack_on_the_clone
> effective FP in clone() with waste 0 = 7feffe60de0
> this is 1468 64-bit words above the page boundary at least 8K away
> clone: Bad address
> glaubitz@...erin:~$
>
> With the patch:
>
> glaubitz@...erin:~$ uname -a
> Linux raverin 6.19.0-rc5+ #20 Sat Jan 17 06:40:52 UTC 2026 sparc64 GNU/Linux
> glaubitz@...erin:~$ ./attack_on_the_clone
> effective FP in clone() with waste 0 = 7fefffaede0
> this is 1468 64-bit words above the page boundary at least 8K away
> Congratulations, clone succeeded
>
> glaubitz@...erin:~$
>
> I can therefore confirm that this patch fixes the bug.
>
> Tested-by: John Paul Adrian Glaubitz <glaubitz@...sik.fu-berlin.de>
Forgot to mention: I reverted the workaround in glibc [1] for testing.
Adrian
> [1] https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=234458024300f0b4b430785999f33eddf059af6a
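
For anyone checking the numbers in the transcript: the "1468" printed by the test
program follows directly from the reported frame pointer. Below is a minimal sketch
of that arithmetic, lifted out of main() so it can be checked on any machine. The
helper name waste_words() and the macro names are made up for illustration; only
the constants (0xFFF, 8 and 1024) come from the test program itself.

```c
#include <stdint.h>

/* Hypothetical names; the values mirror the constants in the test program. */
#define WORD_BYTES   8u      /* size of one 64-bit stack slot */
#define OFFSET_MASK  0xFFFu  /* mask the test program applies to the frame pointer */
#define EXTRA_WORDS  1024u   /* 1024 * 8 bytes = 8 KiB of additional distance */

/*
 * Number of 64-bit words the VLA in call_clone() has to waste for a given
 * effective frame pointer: the offset selected by the mask, expressed in
 * words, plus a fixed 8 KiB worth of words.
 */
static unsigned int waste_words(uint64_t effective_fp)
{
    uint64_t offset_bytes = effective_fp & OFFSET_MASK;

    return EXTRA_WORDS + (unsigned int)(offset_bytes / WORD_BYTES);
}
```

Both runs above report a frame pointer ending in 0xde0 (3552 bytes, i.e. 444 words),
so waste_words() gives 1024 + 444 = 1468 in each case, matching the output.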
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer
`. `' Physicist
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913