Message-ID: <20250714093903.GP905792@noisy.programming.kicks-ass.net>
Date: Mon, 14 Jul 2025 11:39:03 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Masami Hiramatsu <mhiramat@...nel.org>
Cc: Jiri Olsa <jolsa@...nel.org>, Oleg Nesterov <oleg@...hat.com>,
Andrii Nakryiko <andrii@...nel.org>, bpf@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
x86@...nel.org, Song Liu <songliubraving@...com>,
Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
Hao Luo <haoluo@...gle.com>, Steven Rostedt <rostedt@...dmis.org>,
Alan Maguire <alan.maguire@...cle.com>,
David Laight <David.Laight@...lab.com>,
Thomas Weißschuh <thomas@...ch.de>,
Ingo Molnar <mingo@...nel.org>
Subject: Re: [PATCHv5 perf/core 09/22] uprobes/x86: Add uprobe syscall to
speed up uprobe
On Mon, Jul 14, 2025 at 05:39:15PM +0900, Masami Hiramatsu wrote:
> > + /*
> > + * Some of the uprobe consumers has changed sp, we can do nothing,
> > + * just return via iret.
> > + */
>
> Do we allow consumers to change the `sp`? It seems dangerous
> because consumer needs to know whether it is called from
> breakpoint or syscall. Note that it has to set up ax, r11
> and cx on the stack correctly only if it is called from syscall,
> that is not compatible with breakpoint mode.
>
> > + if (regs->sp != sp)
> > + return regs->ax;
>
> Shouldn't we recover regs->ip? Or in this case does the consumer have
> to change ip (== return address from trampoline) too?
>
> IMHO, it should not allow to change the `sp` and `ip` directly
> in syscall mode. In case of kprobes, kprobe jump optimization
> must be disabled explicitly (e.g. setting dummy post_handler)
> if the handler changes `ip`.
>
> Or, even if allowing to modify `sp` and `ip`, it should be helped
> by this function, e.g. stack up the dummy regs->ax/r11/cx on the
> new stack at the new `regs->sp`. This will allow modifying those
> registers transparently, the same as in breakpoint mode.
> In this case, I think we just need to remove above 2 lines.
There are two syscall return paths. The 'normal' one is SYSRET, and for
that you need to undo everything just right.

The other is IRET, at which point we can have whatever state we want,
including a modified SP.

See arch/x86/entry/syscall_64.c:do_syscall_64() and
arch/x86/entry/entry_64.S:entry_SYSCALL_64.

The IRET path should return pt_regs as is, just as a return from an
interrupt/exception would, very much like INT3.
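
To make the distinction concrete, here is a rough userspace model of the
register constraint behind the two paths. It is only a sketch, not the kernel
code: struct model_regs, model_can_sysret() and model_prepare_sysret() are
made-up names, and the real check in do_syscall_64() covers more conditions
than this. The one hardware fact it relies on is that SYSRET reloads RIP from
RCX and RFLAGS from R11, so the fast path only works when those saved values
agree; anything else has to go out via IRET, which restores the saved state as
is.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the kernel's struct pt_regs. */
struct model_regs {
	uint64_t ip, sp, flags, ax, cx, r11;
};

/*
 * SYSRET loads RIP from RCX and RFLAGS from R11, so the fast path can
 * only reproduce the saved state when those pairs match. This model
 * deliberately omits every other condition the kernel checks.
 */
static bool model_can_sysret(const struct model_regs *regs)
{
	return regs->cx == regs->ip && regs->r11 == regs->flags;
}

/*
 * "Undo all things just right": force the two pairs to agree so that
 * the SYSRET condition above holds. The user's real cx/r11 values are
 * lost from the register image here and would have to be restored some
 * other way (assumption: e.g. from copies saved on the user stack).
 */
static void model_prepare_sysret(struct model_regs *regs)
{
	regs->cx = regs->ip;
	regs->r11 = regs->flags;
}
```

In this model, a register image in which a consumer has left cx or r11
different from ip or flags fails model_can_sysret(), which corresponds to
taking the IRET path with the state untouched.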