[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240419235258.64cada90@rorschach.local.home>
Date: Fri, 19 Apr 2024 23:52:58 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: "Masami Hiramatsu (Google)" <mhiramat@...nel.org>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>, Florent Revest
<revest@...omium.org>, linux-trace-kernel@...r.kernel.org, LKML
<linux-kernel@...r.kernel.org>, Martin KaFai Lau <martin.lau@...ux.dev>,
bpf <bpf@...r.kernel.org>, Sven Schnelle <svens@...ux.ibm.com>, Alexei
Starovoitov <ast@...nel.org>, Jiri Olsa <jolsa@...nel.org>, Arnaldo
Carvalho de Melo <acme@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
Alan Maguire <alan.maguire@...cle.com>, Mark Rutland
<mark.rutland@....com>, Peter Zijlstra <peterz@...radead.org>, Thomas
Gleixner <tglx@...utronix.de>, Guo Ren <guoren@...nel.org>
Subject: Re: [PATCH v9 07/36] function_graph: Allow multiple users to attach
to function graph
On Mon, 15 Apr 2024 21:50:20 +0900
"Masami Hiramatsu (Google)" <mhiramat@...nel.org> wrote:
> @@ -27,23 +28,157 @@
>
> #define FGRAPH_RET_SIZE sizeof(struct ftrace_ret_stack)
> #define FGRAPH_RET_INDEX DIV_ROUND_UP(FGRAPH_RET_SIZE, sizeof(long))
> +
> +/*
> + * On entry to a function (via function_graph_enter()), a new ftrace_ret_stack
> + * is allocated on the task's ret_stack with indexes entry, then each
> + * fgraph_ops on the fgraph_array[]'s entryfunc is called and if that returns
> + * non-zero, the index into the fgraph_array[] for that fgraph_ops is recorded
> + * on the indexes entry as a bit flag.
> + * As the associated ftrace_ret_stack saved for those fgraph_ops needs to
> + * be found, the index to it is also added to the ret_stack along with the
> + * index of the fgraph_array[] to each fgraph_ops that needs their retfunc
> + * called.
> + *
> + * The top of the ret_stack (when not empty) will always have a reference
> + * to the last ftrace_ret_stack saved. All references to the
> + * ftrace_ret_stack has the format of:
> + *
> + * bits: 0 - 9 offset in words from the previous ftrace_ret_stack
> + * (bitmap type should have FGRAPH_RET_INDEX always)
> + * bits: 10 - 11 Type of storage
> + * 0 - reserved
> + * 1 - bitmap of fgraph_array index
> + *
> + * For bitmap of fgraph_array index
> + * bits: 12 - 27 The bitmap of fgraph_ops fgraph_array index
I really hate the terminology I came up with here, and would love to
get better terminology for describing what is going on. I looked it
over but I'm constantly getting confused. And I wrote this code!
Perhaps we should use:
@frame : The data that represents a single function call. When a
function is traced, all the data used for all the callbacks
attached to it, is in a single frame. This would replace the
FGRAPH_RET_SIZE as FGRAPH_FRAME_SIZE.
@offset : This is the word size position on the stack. It would
replace INDEX, as I think "index" is being used for more
than one thing. Perhaps it should be "offset" when dealing
with where it is on the shadow stack, and "pos" when dealing
with which callback ops is being referenced.
> + *
> + * That is, at the end of function_graph_enter, if the first and forth
> + * fgraph_ops on the fgraph_array[] (index 0 and 3) needs their retfunc called
> + * on the return of the function being traced, this is what will be on the
> + * task's shadow ret_stack: (the stack grows upward)
> + *
> + * | | <- task->curr_ret_stack
> + * +--------------------------------------------+
> + * | bitmap_type(bitmap:(BIT(3)|BIT(0)), |
> + * | offset:FGRAPH_RET_INDEX) | <- the offset is from here
> + * +--------------------------------------------+
> + * | struct ftrace_ret_stack |
> + * | (stores the saved ret pointer) | <- the offset points here
> + * +--------------------------------------------+
> + * | (X) | (N) | ( N words away from
> + * | | previous ret_stack)
> + *
> + * If a backtrace is required, and the real return pointer needs to be
> + * fetched, then it looks at the task's curr_ret_stack index, if it
> + * is greater than zero (reserved, or right before poped), it would mask
> + * the value by FGRAPH_RET_INDEX_MASK to get the offset index of the
> + * ftrace_ret_stack structure stored on the shadow stack.
> + */
> +
> +#define FGRAPH_RET_INDEX_SIZE 10
Replace SIZE with BITS.
> +#define FGRAPH_RET_INDEX_MASK GENMASK(FGRAPH_RET_INDEX_SIZE - 1, 0)
#define FGRAPH_FRAME_SIZE_BITS 10
#define FGRAPH_FRAME_SIZE_MASK GENMASK(FGRAPH_FRAME_SIZE_BITS - 1, 0)
> +
> +#define FGRAPH_TYPE_SIZE 2
> +#define FGRAPH_TYPE_MASK GENMASK(FGRAPH_TYPE_SIZE - 1, 0)
#define FGRAPH_TYPE_BITS 2
#define FGRAPH_TYPE_MASK GENMASK(FGRAPH_TYPE_BITS - 1, 0)
> +#define FGRAPH_TYPE_SHIFT FGRAPH_RET_INDEX_SIZE
> +
> +enum {
> + FGRAPH_TYPE_RESERVED = 0,
> + FGRAPH_TYPE_BITMAP = 1,
> +};
> +
> +#define FGRAPH_INDEX_SIZE 16
replace "INDEX" with "OPS" as it will be the indexes of ops in the
array.
#define FGRAPH_OPS_BITS 16
#define FGRAPH_OPS_MASK GENMASK(FGRAPH_OPS_BITS - 1, 0)
> +#define FGRAPH_INDEX_MASK GENMASK(FGRAPH_INDEX_SIZE - 1, 0)
> +#define FGRAPH_INDEX_SHIFT (FGRAPH_TYPE_SHIFT + FGRAPH_TYPE_SIZE)
> +
> +/* Currently the max stack index can't be more than register callers */
> +#define FGRAPH_MAX_INDEX (FGRAPH_INDEX_SIZE + FGRAPH_RET_INDEX)
FGRAPH_MAX_INDEX isn't even used. Let's delete it.
> +
> +#define FGRAPH_ARRAY_SIZE FGRAPH_INDEX_SIZE
#define FGRAPH_ARRAY_SIZE FGRAPH_INDEX_BITS
> +
> #define SHADOW_STACK_SIZE (PAGE_SIZE)
> #define SHADOW_STACK_INDEX (SHADOW_STACK_SIZE / sizeof(long))
> /* Leave on a buffer at the end */
> -#define SHADOW_STACK_MAX_INDEX (SHADOW_STACK_INDEX - FGRAPH_RET_INDEX)
> +#define SHADOW_STACK_MAX_INDEX (SHADOW_STACK_INDEX - (FGRAPH_RET_INDEX + 1))
We probably should rename this is previous patches as well.
Unfortunately, it's getting close to the time for me to pick up my wife
from the airport to start our vacation. But I think we should rename a
lot of these variables to make things more consistent.
I'll try to look more at the previous patches as well to make my
comments there, when I get some time. Maybe even later today.
-- Steve
Powered by blists - more mailing lists