Message-ID: <202502081235.5A6F352985@keescook>
Date: Sat, 8 Feb 2025 12:35:43 -0800
From: Kees Cook <kees@...nel.org>
To: Jiri Olsa <olsajiri@...il.com>
Cc: Jann Horn <jannh@...gle.com>, Eyal Birger <eyal.birger@...il.com>,
	luto@...capital.net, wad@...omium.org, oleg@...hat.com,
	mhiramat@...nel.org, andrii@...nel.org,
	alexei.starovoitov@...il.com, cyphar@...har.com,
	songliubraving@...com, yhs@...com, john.fastabend@...il.com,
	peterz@...radead.org, tglx@...utronix.de, bp@...en8.de,
	daniel@...earbox.net, ast@...nel.org, andrii.nakryiko@...il.com,
	rostedt@...dmis.org, rafi@....io, shmulik.ladkani@...il.com,
	bpf@...r.kernel.org, linux-api@...r.kernel.org,
	linux-trace-kernel@...r.kernel.org, x86@...nel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 0/2] seccomp: pass uretprobe system call through
 seccomp

On Sat, Feb 08, 2025 at 01:03:55AM +0100, Jiri Olsa wrote:
> On Fri, Feb 07, 2025 at 04:27:09PM +0100, Jann Horn wrote:
> > On Sun, Feb 2, 2025 at 5:29 PM Eyal Birger <eyal.birger@...il.com> wrote:
> > > uretprobe(2) is a performance enhancement system call added to improve
> > > uretprobes on x86_64.
> > >
> > > Confinement environments such as Docker are not aware of this new system
> > > call and kill confined processes when uretprobes are attached to them.
> > 
> > FYI, you might have similar issues with Syscall User Dispatch
> > (https://docs.kernel.org/admin-guide/syscall-user-dispatch.html) and
> > potentially also with ptrace-based sandboxes, depending on what kinda
> > processes you inject uprobes into. For Syscall User Dispatch, there is
> > already precedent for a bypass based on instruction pointer (see
> > syscall_user_dispatch()).
> > 
> > > Since uretprobe is a "kernel implementation detail" system call which is
> > > not used by userspace application code directly, pass this system call
> > > through seccomp without forcing existing userspace confinement environments
> > > to be changed.
> > 
> > This makes me feel kinda uncomfortable. The purpose of seccomp() is
> > that you can create a process that is as locked down as you want; you
> > can use it for some light limits on what a process can do (like in
> > Docker), or you can use it to make a process that has access to
> > essentially nothing except read(), write() and exit_group(). Even
> > stuff like restart_syscall() and rt_sigreturn() is not currently
> > excepted from that.
> > 
> > I guess your usecase is a little special in that you were already
> > calling from userspace into the kernel with SWBP before, which is also
> > not subject to seccomp; and the syscall is essentially an
> > arch-specific hack to make the SWBP a little faster.
> > 
> > If we do this, we should at least ensure that there is absolutely no
> > way for anything to happen in sys_uretprobe when no uretprobes are
> > configured for the process - the first check in the syscall
> > implementation almost does that, but the implementation could be a bit
> > stricter. It checks for "regs->ip != trampoline_check_ip()", but if no
> > uprobe region exists for the process, trampoline_check_ip() returns
> > `-1 + (uretprobe_syscall_check - uretprobe_trampoline_entry)`. So
> > there is a userspace instruction pointer near the bottom of the
> > address space that is allowed to call into the syscall if uretprobes
> > are not set up. Though the mmap minimum address restrictions will
> > typically prevent creating mappings there, and
> > uprobe_handle_trampoline() will SIGILL us if we get that far without a
> > valid uretprobe.
> 
> nice catch, I think change below should fix that

Thanks! Please backport this to -stable too. :)

-Kees

> 
> thanks,
> jirka
> 
> 
> ---
> diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> index 0c74a4d4df65..9b8837d8f06e 100644
> --- a/arch/x86/kernel/uprobes.c
> +++ b/arch/x86/kernel/uprobes.c
> @@ -368,19 +368,21 @@ void *arch_uretprobe_trampoline(unsigned long *psize)
>  	return &insn;
>  }
>  
> -static unsigned long trampoline_check_ip(void)
> +static unsigned long trampoline_check_ip(unsigned long tramp)
>  {
> -	unsigned long tramp = uprobe_get_trampoline_vaddr();
> -
>  	return tramp + (uretprobe_syscall_check - uretprobe_trampoline_entry);
>  }
>  
>  SYSCALL_DEFINE0(uretprobe)
>  {
>  	struct pt_regs *regs = task_pt_regs(current);
> -	unsigned long err, ip, sp, r11_cx_ax[3];
> +	unsigned long err, ip, sp, r11_cx_ax[3], tramp;
> +
> +	tramp = uprobe_get_trampoline_vaddr();
> +	if (tramp == -1)
> +		goto sigill;
>  
> -	if (regs->ip != trampoline_check_ip())
> +	if (regs->ip != trampoline_check_ip(tramp))
>  		goto sigill;
>  
>  	err = copy_from_user(r11_cx_ax, (void __user *)regs->sp, sizeof(r11_cx_ax));

-- 
Kees Cook
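
Archive note: the wraparound Jann describes, and the effect of the extra
check in Jiri's patch, can be sketched in plain userspace C. This is only an
illustration under stated assumptions, not the kernel code: NO_TRAMPOLINE
stands in for the value the patch compares against -1, CHECK_OFFSET is a
made-up placeholder for (uretprobe_syscall_check - uretprobe_trampoline_entry),
and allowed_before()/allowed_after() are hypothetical helpers that model only
the ip check at the top of the syscall.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: models the "no trampoline" value that the patch tests
 * with "tramp == -1". */
#define NO_TRAMPOLINE ((unsigned long)-1)

/* Assumption: placeholder for
 * uretprobe_syscall_check - uretprobe_trampoline_entry. */
#define CHECK_OFFSET 0x10UL

/* Same arithmetic as trampoline_check_ip(); unsigned addition wraps. */
static unsigned long check_ip(unsigned long tramp)
{
	return tramp + CHECK_OFFSET;
}

/* Pre-patch model: only the instruction pointer is compared. */
static bool allowed_before(unsigned long ip, unsigned long tramp)
{
	return ip == check_ip(tramp);
}

/* Post-patch model: reject outright when no trampoline exists. */
static bool allowed_after(unsigned long ip, unsigned long tramp)
{
	if (tramp == NO_TRAMPOLINE)
		return false;
	return ip == check_ip(tramp);
}

/* Sanity checks for the two models. */
void self_check(void)
{
	unsigned long low_ip = CHECK_OFFSET - 1; /* -1 + offset, wrapped */
	unsigned long tramp = 0x7ffff000UL;      /* arbitrary valid address */

	assert(allowed_before(low_ip, NO_TRAMPOLINE));
	assert(!allowed_after(low_ip, NO_TRAMPOLINE));
	assert(allowed_after(tramp + CHECK_OFFSET, tramp));
}
```

With no trampoline, the pre-patch model accepts the low address
CHECK_OFFSET - 1, while the post-patch model rejects every ip. When a
trampoline does exist, both models behave the same.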
