Message-ID: <yt9dmsjyx067.fsf@linux.ibm.com>
Date: Mon, 23 Sep 2024 10:08:00 +0200
From: Sven Schnelle <svens@...ux.ibm.com>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
    Steven Rostedt <rostedt@...dmis.org>,
    Florent Revest <revest@...omium.org>,
    linux-trace-kernel@...r.kernel.org,
    LKML <linux-kernel@...r.kernel.org>,
    Martin KaFai Lau <martin.lau@...ux.dev>, bpf <bpf@...r.kernel.org>,
    Alexei Starovoitov <ast@...nel.org>, Jiri Olsa <jolsa@...nel.org>,
    Arnaldo Carvalho de Melo <acme@...nel.org>,
    Daniel Borkmann <daniel@...earbox.net>,
    Alan Maguire <alan.maguire@...cle.com>,
    Mark Rutland <mark.rutland@....com>,
    Peter Zijlstra <peterz@...radead.org>,
    Thomas Gleixner <tglx@...utronix.de>, Guo Ren <guoren@...nel.org>,
    linux-arch@...r.kernel.org
Subject: Re: [PATCH v15 13/19] fprobe: Rewrite fprobe on function-graph tracer

"Masami Hiramatsu (Google)" <mhiramat@...nel.org> writes:
> From: Masami Hiramatsu (Google) <mhiramat@...nel.org>
>
> Rewrite fprobe implementation on function-graph tracer.
> Major API changes are:
> - 'nr_maxactive' field is deprecated.
> - This depends on CONFIG_DYNAMIC_FTRACE_WITH_ARGS or
> !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS, and
> CONFIG_HAVE_FUNCTION_GRAPH_FREGS. So currently works only
> on x86_64.
> - Currently the entry size is limited to 15 * sizeof(long).
> - If there are too many fprobe exit handlers set on the same
> function, it will fail to probe.
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>
> ---
> Changes in v14:
> - Add ftrace_regs_get_return_address() for riscv.
> Changes in v12:
> - Skip updating ftrace hash if not required.
> Changes in v9:
> - Remove unneeded prototype of ftrace_regs_get_return_address().
> - Fix entry data address calculation.
> - Remove DIV_ROUND_UP() from hotpath.
> Changes in v8:
> - Use trace_func_graph_ret/ent_t for fgraph_ops.
> - Update CONFIG_FPROBE dependencies.
> - Add ftrace_regs_get_return_address() for each arch.
> Changes in v3:
> - Update for new reserve_data/retrieve_data API.
> - Fix internal push/pop on fgraph data logic so that it can
> correctly save/restore the returning fprobes.
> Changes in v2:
> - Add more lockdep_assert_held(fprobe_mutex)
> - Use READ_ONCE() and WRITE_ONCE() for fprobe_hlist_node::fp.
> - Add NOKPROBE_SYMBOL() for the functions which are called from
> entry/exit callback.
> ---
> arch/arm64/include/asm/ftrace.h | 6
> arch/loongarch/include/asm/ftrace.h | 6
> arch/powerpc/include/asm/ftrace.h | 6
> arch/riscv/include/asm/ftrace.h | 5
> arch/s390/include/asm/ftrace.h | 6
> arch/x86/include/asm/ftrace.h | 6
> include/linux/fprobe.h | 53 ++-
> kernel/trace/Kconfig | 8
> kernel/trace/fprobe.c | 639 +++++++++++++++++++++++++----------
> lib/test_fprobe.c | 45 --
> 10 files changed, 535 insertions(+), 245 deletions(-)
[..]
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index 90a3c8e2bbdf..5a0b4ef52fa7 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> +/* The entry data size is 4 bits (=16) * sizeof(long) in maximum */
> +#define FPROBE_HEADER_SIZE_BITS 4
> +#define MAX_FPROBE_DATA_SIZE_WORD ((1L << FPROBE_HEADER_SIZE_BITS) - 1)
> +#define MAX_FPROBE_DATA_SIZE (MAX_FPROBE_DATA_SIZE_WORD * sizeof(long))
> +#define FPROBE_HEADER_PTR_BITS (BITS_PER_LONG - FPROBE_HEADER_SIZE_BITS)
> +#define FPROBE_HEADER_PTR_MASK GENMASK(FPROBE_HEADER_PTR_BITS - 1, 0)
> +#define FPROBE_HEADER_SIZE sizeof(unsigned long)
> +
> +static inline unsigned long encode_fprobe_header(struct fprobe *fp, int size_words)
> +{
> + if (WARN_ON_ONCE(size_words > MAX_FPROBE_DATA_SIZE_WORD ||
> + ((unsigned long)fp & ~FPROBE_HEADER_PTR_MASK) !=
> + ~FPROBE_HEADER_PTR_MASK)) {
> + return 0;
> }
> + return ((unsigned long)size_words << FPROBE_HEADER_PTR_BITS) |
> + ((unsigned long)fp & FPROBE_HEADER_PTR_MASK);
> +}

I haven't fully understood yet why this logic is needed, but the
WARN_ON_ONCE() triggers on s390. The check requires the upper
FPROBE_HEADER_SIZE_BITS (4) bits of fp to be all ones. I assume that
holds on x86 (and likely other architectures) because kernel addresses
there always have the upper bits set, but it does not hold on s390. As
an example, in my test setup fp is 0x8feec218 on s390, while it is
0xffff888100add118 in x86-kvm.
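
For reference, the pointer part of the check can be reproduced in
userspace with a minimal sketch like the one below. It assumes a 64-bit
unsigned long, open-codes the mask instead of using GENMASK(), and the
helper name fp_passes_header_check() is made up for illustration; it is
not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 64-bit longs, as on s390x and x86_64. */
#define BITS_PER_LONG            64
#define FPROBE_HEADER_SIZE_BITS  4
#define FPROBE_HEADER_PTR_BITS   (BITS_PER_LONG - FPROBE_HEADER_SIZE_BITS)
/* Userspace stand-in for GENMASK(FPROBE_HEADER_PTR_BITS - 1, 0). */
#define FPROBE_HEADER_PTR_MASK   ((UINT64_C(1) << FPROBE_HEADER_PTR_BITS) - 1)

/*
 * Hypothetical helper: mirrors only the pointer half of the condition in
 * encode_fprobe_header(). Returns true when the upper
 * FPROBE_HEADER_SIZE_BITS bits of fp are all ones, i.e. when the
 * WARN_ON_ONCE() would not fire because of the pointer value.
 */
bool fp_passes_header_check(uint64_t fp)
{
	return (fp & ~FPROBE_HEADER_PTR_MASK) == ~FPROBE_HEADER_PTR_MASK;
}
```

With the two addresses from my setup, fp_passes_header_check() returns
false for 0x8feec218 (s390) and true for 0xffff888100add118 (x86-kvm).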
Regards
Sven