[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20241118171808.316ae124cd57886e813cb98f@kernel.org>
Date: Mon, 18 Nov 2024 17:18:08 +0900
From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
To: Jiri Olsa <jolsa@...nel.org>
Cc: Oleg Nesterov <oleg@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Andrii Nakryiko <andrii@...nel.org>, bpf@...r.kernel.org, Song Liu
<songliubraving@...com>, Yonghong Song <yhs@...com>, John Fastabend
<john.fastabend@...il.com>, Hao Luo <haoluo@...gle.com>, Steven Rostedt
<rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>, Alan Maguire
<alan.maguire@...cle.com>, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org
Subject: Re: [RFC perf/core 07/11] uprobes/x86: Add support to optimize
uprobes
On Tue, 5 Nov 2024 14:34:01 +0100
Jiri Olsa <jolsa@...nel.org> wrote:
> Putting together all the previously added pieces to support optimized
> uprobes on top of 5-byte nop instruction.
>
> The current uprobe execution goes through following:
> - installs breakpoint instruction over original instruction
> - exception handler hit and calls related uprobe consumers
> - and either simulates original instruction or does out of line single step
> execution of it
> - returns to user space
>
> The optimized uprobe path
>
> - checks the original instruction is 5-byte nop (plus other checks)
> - adds (or uses existing) user space trampoline and overwrites original
> instruction (5-byte nop) with call to user space trampoline
> - the user space trampoline executes uprobe syscall that calls related uprobe
> consumers
> - trampoline returns back to next instruction
>
> This approach won't speed up all uprobes as it's limited to using nop5 as
> original instruction, but we could use nop5 as USDT probe instruction (which
> uses single byte nop ATM) and speed up the USDT probes.
>
> This patch overloads related arch functions in uprobe_write_opcode and
> set_orig_insn so they can install call instruction if needed.
>
> The arch_uprobe_optimize triggers the uprobe optimization and is called after
> first uprobe hit. I originally had it called on uprobe installation but then
> it clashed with elf loader, because the user space trampoline was added in a
> place where loader might need to put elf segments, so I decided to do it after
> first uprobe hit when loading is done.
>
> TODO release uprobe trampoline when it's no longer needed.. we might need to
> stop all cpus to make sure no user space thread is in the trampoline.. or we
> might just keep it, because there's just one 4GB memory region?
>
> Signed-off-by: Jiri Olsa <jolsa@...nel.org>
> ---
> arch/x86/include/asm/uprobes.h | 7 ++
> arch/x86/kernel/uprobes.c | 130 +++++++++++++++++++++++++++++++++
> include/linux/uprobes.h | 1 +
> kernel/events/uprobes.c | 3 +
> 4 files changed, 141 insertions(+)
>
> diff --git a/arch/x86/include/asm/uprobes.h b/arch/x86/include/asm/uprobes.h
> index 678fb546f0a7..84a75ed748f0 100644
> --- a/arch/x86/include/asm/uprobes.h
> +++ b/arch/x86/include/asm/uprobes.h
> @@ -20,6 +20,11 @@ typedef u8 uprobe_opcode_t;
> #define UPROBE_SWBP_INSN 0xcc
> #define UPROBE_SWBP_INSN_SIZE 1
>
> +enum {
> + ARCH_UPROBE_FLAG_CAN_OPTIMIZE = 0,
> + ARCH_UPROBE_FLAG_OPTIMIZED = 1,
> +};
> +
> struct uprobe_xol_ops;
>
> struct arch_uprobe {
> @@ -45,6 +50,8 @@ struct arch_uprobe {
> u8 ilen;
> } push;
> };
> +
> + unsigned long flags;
> };
>
> struct arch_uprobe_task {
> diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
> index 02aa4519b677..50ccf24ff42c 100644
> --- a/arch/x86/kernel/uprobes.c
> +++ b/arch/x86/kernel/uprobes.c
> @@ -18,6 +18,7 @@
> #include <asm/processor.h>
> #include <asm/insn.h>
> #include <asm/mmu_context.h>
> +#include <asm/nops.h>
>
> /* Post-execution fixups. */
>
> @@ -877,6 +878,33 @@ static const struct uprobe_xol_ops push_xol_ops = {
> .emulate = push_emulate_op,
> };
>
> +static int is_nop5_insns(uprobe_opcode_t *insn)
> +{
> + return !memcmp(insn, x86_nops[5], 5);
Maybe better to use BYTES_NOP5 directly?
> +}
> +
> +static int is_call_insns(uprobe_opcode_t *insn)
> +{
> + return *insn == 0xe8;
0xe8 -> CALL_INSN_OPCODE
Thank you,
--
Masami Hiramatsu (Google) <mhiramat@...nel.org>
Powered by blists - more mailing lists