[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZjU4ganRF1Cbiug6@krava>
Date: Fri, 3 May 2024 21:18:25 +0200
From: Jiri Olsa <olsajiri@...il.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
Cc: "olsajiri@...il.com" <olsajiri@...il.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"songliubraving@...com" <songliubraving@...com>,
"luto@...nel.org" <luto@...nel.org>,
"mhiramat@...nel.org" <mhiramat@...nel.org>,
"andrii@...nel.org" <andrii@...nel.org>,
"linux-api@...r.kernel.org" <linux-api@...r.kernel.org>,
"john.fastabend@...il.com" <john.fastabend@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"mingo@...hat.com" <mingo@...hat.com>,
"rostedt@...dmis.org" <rostedt@...dmis.org>,
"ast@...nel.org" <ast@...nel.org>,
"tglx@...utronix.de" <tglx@...utronix.de>,
"yhs@...com" <yhs@...com>,
"linux-man@...r.kernel.org" <linux-man@...r.kernel.org>,
"oleg@...hat.com" <oleg@...hat.com>,
"daniel@...earbox.net" <daniel@...earbox.net>,
"linux-trace-kernel@...r.kernel.org" <linux-trace-kernel@...r.kernel.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"bp@...en8.de" <bp@...en8.de>, "x86@...nel.org" <x86@...nel.org>
Subject: Re: [PATCHv4 bpf-next 2/7] uprobe: Add uretprobe syscall to speed up
return probe
On Fri, May 03, 2024 at 03:53:15PM +0000, Edgecombe, Rick P wrote:
> On Fri, 2024-05-03 at 15:04 +0200, Jiri Olsa wrote:
> > On Fri, May 03, 2024 at 01:34:53PM +0200, Peter Zijlstra wrote:
> > > On Thu, May 02, 2024 at 02:23:08PM +0200, Jiri Olsa wrote:
> > > > Adding uretprobe syscall instead of trap to speed up return probe.
> > > >
> > > > At the moment the uretprobe setup/path is:
> > > >
> > > > - install entry uprobe
> > > >
> > > > - when the uprobe is hit, it overwrites probed function's return
> > > > address
> > > > on stack with address of the trampoline that contains breakpoint
> > > > instruction
> > > >
> > > > - the breakpoint trap code handles the uretprobe consumers execution
> > > > and
> > > > jumps back to original return address
>
> Hi,
>
> I worked on the x86 shadow stack support.
>
> I didn't know uprobes did anything like this. In hindsight I should have looked
> more closely. The current upstream behavior is to overwrite the return address
> on the stack?
>
> Stupid uprobes question - what is actually overwriting the return address on the
> stack? Is it the kernel? If so perhaps the kernel could just update the shadow
> stack at the same time.
yes, it's in kernel - arch_uretprobe_hijack_return_addr .. so I guess
we need to update the shadow stack with the new return value as well
>
> > > >
> > > > This patch replaces the above trampoline's breakpoint instruction with new
> > > > ureprobe syscall call. This syscall does exactly the same job as the trap
> > > > with some more extra work:
> > > >
> > > > - syscall trampoline must save original value for rax/r11/rcx registers
> > > > on stack - rax is set to syscall number and r11/rcx are changed and
> > > > used by syscall instruction
> > > >
> > > > - the syscall code reads the original values of those registers and
> > > > restore those values in task's pt_regs area
> > > >
> > > > - only caller from trampoline exposed in '[uprobes]' is allowed,
> > > > the process will receive SIGILL signal otherwise
> > > >
> > >
> > > Did you consider shadow stacks? IIRC we currently have userspace shadow
> > > stack support available, and that will utterly break all of this.
> >
> > nope.. I guess it's the extra ret instruction in the trampoline that would
> > make it crash?
>
> The original behavior seems problematic for shadow stack IIUC. I'm not sure of
> the additional breakage with the new behavior.
I can see it's broken also for current uprobes
>
> Roughly, how shadow stack works is there is an additional protected stack for
> the app thread. The HW pushes to from the shadow stack with CALL, and pops from
> it with RET. But it also continues to push and pop from the normal stack. On
> pop, if the values don't match between the two stacks, an exception is
> generated. The whole point is to prevent the app from overwriting its stack
> return address to return to random places.
>
> Userspace cannot (normally) write to the shadow stack, but the kernel can do
> this or adust the SSP (shadow stack pointer). So in the kernel (for things like
> sigreturn) there is an ability to do what is needed. Ptracers also can do things
> like this.
hack below seems to fix it for the current uprobe setup,
we need similar fix for the uretprobe syscall trampoline setup
jirka
---
diff --git a/arch/x86/include/asm/shstk.h b/arch/x86/include/asm/shstk.h
index 42fee8959df7..99a0948a3b79 100644
--- a/arch/x86/include/asm/shstk.h
+++ b/arch/x86/include/asm/shstk.h
@@ -21,6 +21,7 @@ unsigned long shstk_alloc_thread_stack(struct task_struct *p, unsigned long clon
void shstk_free(struct task_struct *p);
int setup_signal_shadow_stack(struct ksignal *ksig);
int restore_signal_shadow_stack(void);
+void uprobe_change_stack(unsigned long addr);
#else
static inline long shstk_prctl(struct task_struct *task, int option,
unsigned long arg2) { return -EINVAL; }
diff --git a/arch/x86/kernel/shstk.c b/arch/x86/kernel/shstk.c
index 59e15dd8d0f8..d2c4dbe5843c 100644
--- a/arch/x86/kernel/shstk.c
+++ b/arch/x86/kernel/shstk.c
@@ -577,3 +577,11 @@ long shstk_prctl(struct task_struct *task, int option, unsigned long arg2)
return wrss_control(true);
return -EINVAL;
}
+
+void uprobe_change_stack(unsigned long addr)
+{
+ unsigned long ssp;
+
+ ssp = get_user_shstk_addr();
+ write_user_shstk_64((u64 __user *)ssp, (u64)addr);
+}
diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index 81e6ee95784d..88afbeaacb8f 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -348,7 +348,7 @@ void *arch_uprobe_trampoline(unsigned long *psize)
* only for native 64-bit process, the compat process still uses
* standard breakpoint.
*/
- if (user_64bit_mode(regs)) {
+ if (0 && user_64bit_mode(regs)) {
*psize = uretprobe_syscall_end - uretprobe_syscall_entry;
return uretprobe_syscall_entry;
}
@@ -1191,8 +1191,10 @@ arch_uretprobe_hijack_return_addr(unsigned long trampoline_vaddr, struct pt_regs
return orig_ret_vaddr;
nleft = copy_to_user((void __user *)regs->sp, &trampoline_vaddr, rasize);
- if (likely(!nleft))
+ if (likely(!nleft)) {
+ uprobe_change_stack(trampoline_vaddr);
return orig_ret_vaddr;
+ }
if (nleft != rasize) {
pr_err("return address clobbered: pid=%d, %%sp=%#lx, %%ip=%#lx\n",
Powered by blists - more mailing lists