lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <66fe9288-05ff-1bf3-dd40-e7b3cbdf1af9@sangfor.com.cn>
Date:   Fri, 14 Apr 2023 15:44:03 +0800
From:   Donglin Peng <pengdonglin@...gfor.com.cn>
To:     mhiramat@...nel.org, rostedt@...dmis.org, linux@...linux.org.uk,
        mark.rutland@....com, will@...nel.org, catalin.marinas@....com,
        rmk+kernel@...linux.org.uk, palmer@...belt.com,
        paul.walmsley@...ive.com, aou@...s.berkeley.edu,
        tglx@...utronix.de, dave.hansen@...ux.intel.com, x86@...nel.org,
        bp@...en8.de, hpa@...or.com, chenhuacai@...nel.org,
        zhangqing@...ngson.cn, kernel@...0n.name, mingo@...hat.com,
        peterz@...radead.org, xiehuan09@...il.com, dinghui@...gfor.com.cn,
        huangcun@...gfor.com.cn, dolinux.peng@...il.com
Cc:     linux-trace-kernel@...r.kernel.org, loongarch@...ts.linux.dev,
        linux-riscv@...ts.infradead.org,
        linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        Florian Kauer <florian.kauer@...utronix.de>
Subject: Re: [PATCH v11 1/8] function_graph: Support recording and printing
 the return value of function

On 2023/4/8 20:42, Donglin Peng wrote:
> Analyzing system call failures with the function_graph tracer can be a
> time-consuming process, particularly when locating the kernel function
> that first returns an error in the trace logs. This change aims to
> simplify the process by recording the function return value to the
> 'retval' member of 'ftrace_graph_ent' and printing it when outputting

There is a mistake, the retval member is added to ftrace_graph_ret.

> the trace log.
> 
> We have introduced new trace options: funcgraph-retval and
> funcgraph-retval-hex. The former controls whether to display the return
> value, while the latter controls the display format.
> 
> Please note that even if a function's return type is void, a return
> value will still be printed. You can simply ignore it.
> 
> This patch only establishes the fundamental infrastructure. Subsequent
> patches will make this feature available on some commonly used processor
> architectures.
> 
> Here is an example:
> 
> I attempted to attach the demo process to a cpu cgroup, but it failed:
> 
> echo `pidof demo` > /sys/fs/cgroup/cpu/test/tasks
> -bash: echo: write error: Invalid argument
> 
> The strace logs indicate that the write system call returned -EINVAL(-22):
> ...
> write(1, "273\n", 4)                    = -1 EINVAL (Invalid argument)
> ...
> 
> To capture trace logs during a write system call, use the following
> commands:
> 
> cd /sys/kernel/debug/tracing/
> echo 0 > tracing_on
> echo > trace
> echo *sys_write > set_graph_function
> echo *spin* > set_graph_notrace
> echo *rcu* >> set_graph_notrace
> echo *alloc* >> set_graph_notrace
> echo preempt* >> set_graph_notrace
> echo kfree* >> set_graph_notrace
> echo $$ > set_ftrace_pid
> echo function_graph > current_tracer
> echo 1 > options/funcgraph-retval
> echo 0 > options/funcgraph-retval-hex
> echo 1 > tracing_on
> echo `pidof demo` > /sys/fs/cgroup/cpu/test/tasks
> echo 0 > tracing_on
> cat trace > ~/trace.log
> 
> To locate the root cause, search for error code -22 directly in the file
> trace.log and identify the first function that returned -22. Once you
> have identified this function, examine its code to determine the root
> cause.
> 
> For example, in the trace log below, cpu_cgroup_can_attach
> returned -22 first, so we can focus our analysis on this function to
> identify the root cause.
> 
> ...
> 
>   1)          | cgroup_migrate() {
>   1) 0.651 us |   cgroup_migrate_add_task(); /* = 0xffff93fcfd346c00 */
>   1)          |   cgroup_migrate_execute() {
>   1)          |     cpu_cgroup_can_attach() {
>   1)          |       cgroup_taskset_first() {
>   1) 0.732 us |         cgroup_taskset_next(); /* = 0xffff93fc8fb20000 */
>   1) 1.232 us |       } /* cgroup_taskset_first = 0xffff93fc8fb20000 */
>   1) 0.380 us |       sched_rt_can_attach(); /* = 0x0 */
>   1) 2.335 us |     } /* cpu_cgroup_can_attach = -22 */
>   1) 4.369 us |   } /* cgroup_migrate_execute = -22 */
>   1) 7.143 us | } /* cgroup_migrate = -22 */
> 
> ...
> 
> Tested-by: Florian Kauer <florian.kauer@...utronix.de>
> Acked-by: Masami Hiramatsu (Google) <mhiramat@...nel.org>
> Signed-off-by: Donglin Peng <pengdonglin@...gfor.com.cn>
> ---
> v11:
>   - Update the Kconfig for FUNCTION_GRAPH_RETVAL.
> 
> v9:
>   - Update comments for ftrace_return_to_handler
> 
> v8:
>   - Add some comments on CONFIG_HAVE_FUNCTION_GRAPH_RETVAL
>   - Make CONFIG_FUNCTION_GRAPH_RETVAL switable
> 
> v7:
>   - Rename trace option 'graph_retval_hex' to 'funcgraph-retval-hex'
>   - Separate each architecture modification info individual patches
>   - Introduce a new structure fgraph_ret_regs for each architecture to
>     hold return registers
> 
> v6:
>   - Remove the conversion code for short and char types, because these
>     two types are rarely used to store an error code.
> 
> v5:
>   - Pass both the return values to ftrace_return_to_handler
>   - Modify the parameter sequence of ftrace_return_to_handler to
>     decrease the modification of assembly code, thanks to Russell King
>   - Wrap __ftrace_return_to_handler with ftrace_return_to_handler
>     for compatible
> 
> v4:
>   - Modify commit message
>   - Introduce new option graph_retval_hex to control display format
>   - Introduce macro CONFIG_FUNCTION_GRAPH_RETVAL and
>     CONFIG_HAVE_FUNCTION_GRAPH_RETVAL
>   - Add related arch maintainers to review
> 
> v3:
>   - Modify the commit message: add trace logs processed with the btf tool
> 
> v2:
>   - Modify the commit message: use BTF to get the return type of function
> ---
>   include/linux/ftrace.h               |  3 +
>   kernel/trace/Kconfig                 | 15 +++++
>   kernel/trace/fgraph.c                | 23 ++++++-
>   kernel/trace/trace.h                 |  2 +
>   kernel/trace/trace_entries.h         | 26 ++++++++
>   kernel/trace/trace_functions_graph.c | 93 +++++++++++++++++++++++++---
>   6 files changed, 151 insertions(+), 11 deletions(-)
> 
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 402fc061de75..7639a70e17c1 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -1032,6 +1032,9 @@ struct ftrace_graph_ent {
>    */
>   struct ftrace_graph_ret {
>   	unsigned long func; /* Current function */
> +#ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> +	unsigned long retval;
> +#endif
>   	int depth;
>   	/* Number of functions that overran the depth limit for current task */
>   	unsigned int overrun;
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index a856d4a34c67..fb73a43be261 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -31,6 +31,9 @@ config HAVE_FUNCTION_GRAPH_TRACER
>   	help
>   	  See Documentation/trace/ftrace-design.rst
>   
> +config HAVE_FUNCTION_GRAPH_RETVAL
> +	bool
> +
>   config HAVE_DYNAMIC_FTRACE
>   	bool
>   	help
> @@ -227,6 +230,18 @@ config FUNCTION_GRAPH_TRACER
>   	  the return value. This is done by setting the current return
>   	  address on the current task structure into a stack of calls.
>   
> +config FUNCTION_GRAPH_RETVAL
> +	bool "Kernel Function Graph Return Value"
> +	depends on HAVE_FUNCTION_GRAPH_RETVAL
> +	depends on FUNCTION_GRAPH_TRACER
> +	default n
> +	help
> +	  Support recording and printing the function return value when
> +	  using function graph tracer. It can be helpful to locate functions
> +	  that return errors. This feature is off by default, and you can
> +	  enable it via the trace option funcgraph-retval.
> +	  See Documentation/trace/ftrace.rst
> +
>   config DYNAMIC_FTRACE
>   	bool "enable/disable function tracing dynamically"
>   	depends on FUNCTION_TRACER
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 218cd95bf8e4..170ec9429efb 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -240,12 +240,16 @@ static struct notifier_block ftrace_suspend_notifier = {
>    * Send the trace to the ring-buffer.
>    * @return the original return address.
>    */
> -unsigned long ftrace_return_to_handler(unsigned long frame_pointer)
> +static unsigned long __ftrace_return_to_handler(struct fgraph_ret_regs *ret_regs,
> +						unsigned long frame_pointer)
>   {
>   	struct ftrace_graph_ret trace;
>   	unsigned long ret;
>   
>   	ftrace_pop_return_trace(&trace, &ret, frame_pointer);
> +#ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> +	trace.retval = fgraph_ret_regs_return_value(ret_regs);
> +#endif
>   	trace.rettime = trace_clock_local();
>   	ftrace_graph_return(&trace);
>   	/*
> @@ -266,6 +270,23 @@ unsigned long ftrace_return_to_handler(unsigned long frame_pointer)
>   	return ret;
>   }
>   
> +/*
> + * After all architecures have selected HAVE_FUNCTION_GRAPH_RETVAL, we can
> + * leave only ftrace_return_to_handler(ret_regs).
> + */
> +#ifdef CONFIG_HAVE_FUNCTION_GRAPH_RETVAL
> +unsigned long ftrace_return_to_handler(struct fgraph_ret_regs *ret_regs)
> +{
> +	return __ftrace_return_to_handler(ret_regs,
> +				fgraph_ret_regs_frame_pointer(ret_regs));
> +}
> +#else
> +unsigned long ftrace_return_to_handler(unsigned long frame_pointer)
> +{
> +	return __ftrace_return_to_handler(NULL, frame_pointer);
> +}
> +#endif
> +
>   /**
>    * ftrace_graph_get_ret_stack - return the entry of the shadow stack
>    * @task: The task to read the shadow stack from
> diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> index 616e1aa1c4da..0a3c4582df10 100644
> --- a/kernel/trace/trace.h
> +++ b/kernel/trace/trace.h
> @@ -831,6 +831,8 @@ static __always_inline bool ftrace_hash_empty(struct ftrace_hash *hash)
>   #define TRACE_GRAPH_PRINT_TAIL          0x100
>   #define TRACE_GRAPH_SLEEP_TIME          0x200
>   #define TRACE_GRAPH_GRAPH_TIME          0x400
> +#define TRACE_GRAPH_PRINT_RETVAL        0x800
> +#define TRACE_GRAPH_PRINT_RETVAL_HEX    0x1000
>   #define TRACE_GRAPH_PRINT_FILL_SHIFT	28
>   #define TRACE_GRAPH_PRINT_FILL_MASK	(0x3 << TRACE_GRAPH_PRINT_FILL_SHIFT)
>   
> diff --git a/kernel/trace/trace_entries.h b/kernel/trace/trace_entries.h
> index cd41e863b51c..340b2fa98218 100644
> --- a/kernel/trace/trace_entries.h
> +++ b/kernel/trace/trace_entries.h
> @@ -86,6 +86,30 @@ FTRACE_ENTRY_PACKED(funcgraph_entry, ftrace_graph_ent_entry,
>   );
>   
>   /* Function return entry */
> +#ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> +
> +FTRACE_ENTRY_PACKED(funcgraph_exit, ftrace_graph_ret_entry,
> +
> +	TRACE_GRAPH_RET,
> +
> +	F_STRUCT(
> +		__field_struct(	struct ftrace_graph_ret,	ret	)
> +		__field_packed(	unsigned long,	ret,		func	)
> +		__field_packed(	unsigned long,	ret,		retval	)
> +		__field_packed(	int,		ret,		depth	)
> +		__field_packed(	unsigned int,	ret,		overrun	)
> +		__field_packed(	unsigned long long, ret,	calltime)
> +		__field_packed(	unsigned long long, ret,	rettime	)
> +	),
> +
> +	F_printk("<-- %ps (%d) (start: %llx  end: %llx) over: %d retval: %lx",
> +		 (void *)__entry->func, __entry->depth,
> +		 __entry->calltime, __entry->rettime,
> +		 __entry->depth, __entry->retval)
> +);
> +
> +#else
> +
>   FTRACE_ENTRY_PACKED(funcgraph_exit, ftrace_graph_ret_entry,
>   
>   	TRACE_GRAPH_RET,
> @@ -105,6 +129,8 @@ FTRACE_ENTRY_PACKED(funcgraph_exit, ftrace_graph_ret_entry,
>   		 __entry->depth)
>   );
>   
> +#endif
> +
>   /*
>    * Context switch trace entry - which task (and prio) we switched from/to:
>    *
> diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
> index 203204cadf92..c35fbaab2a47 100644
> --- a/kernel/trace/trace_functions_graph.c
> +++ b/kernel/trace/trace_functions_graph.c
> @@ -58,6 +58,12 @@ static struct tracer_opt trace_opts[] = {
>   	{ TRACER_OPT(funcgraph-irqs, TRACE_GRAPH_PRINT_IRQS) },
>   	/* Display function name after trailing } */
>   	{ TRACER_OPT(funcgraph-tail, TRACE_GRAPH_PRINT_TAIL) },
> +#ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> +	/* Display function return value ? */
> +	{ TRACER_OPT(funcgraph-retval, TRACE_GRAPH_PRINT_RETVAL) },
> +	/* Display function return value in hexadecimal format ? */
> +	{ TRACER_OPT(funcgraph-retval-hex, TRACE_GRAPH_PRINT_RETVAL_HEX) },
> +#endif
>   	/* Include sleep time (scheduled out) between entry and return */
>   	{ TRACER_OPT(sleep-time, TRACE_GRAPH_SLEEP_TIME) },
>   
> @@ -619,6 +625,56 @@ print_graph_duration(struct trace_array *tr, unsigned long long duration,
>   	trace_seq_puts(s, "|  ");
>   }
>   
> +#ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> +
> +#define __TRACE_GRAPH_PRINT_RETVAL TRACE_GRAPH_PRINT_RETVAL
> +
> +static void print_graph_retval(struct trace_seq *s, unsigned long retval,
> +				bool leaf, void *func, bool hex_format)
> +{
> +	unsigned long err_code = 0;
> +
> +	if (retval == 0 || hex_format)
> +		goto done;
> +
> +	/* Check if the return value matches the negative format */
> +	if (IS_ENABLED(CONFIG_64BIT) && (retval & BIT(31)) &&
> +		(((u64)retval) >> 32) == 0) {
> +		/* sign extension */
> +		err_code = (unsigned long)(s32)retval;
> +	} else {
> +		err_code = retval;
> +	}
> +
> +	if (!IS_ERR_VALUE(err_code))
> +		err_code = 0;
> +
> +done:
> +	if (leaf) {
> +		if (hex_format || (err_code == 0))
> +			trace_seq_printf(s, "%ps(); /* = 0x%lx */\n",
> +					func, retval);
> +		else
> +			trace_seq_printf(s, "%ps(); /* = %ld */\n",
> +					func, err_code);
> +	} else {
> +		if (hex_format || (err_code == 0))
> +			trace_seq_printf(s, "} /* %ps = 0x%lx */\n",
> +					func, retval);
> +		else
> +			trace_seq_printf(s, "} /* %ps = %ld */\n",
> +					func, err_code);
> +	}
> +}
> +
> +#else
> +
> +#define __TRACE_GRAPH_PRINT_RETVAL 0
> +
> +#define print_graph_retval(_seq, _retval, _leaf, _func, _format) do {} while (0)
> +
> +#endif
> +
>   /* Case of a leaf function on its call entry */
>   static enum print_line_t
>   print_graph_entry_leaf(struct trace_iterator *iter,
> @@ -663,7 +719,15 @@ print_graph_entry_leaf(struct trace_iterator *iter,
>   	for (i = 0; i < call->depth * TRACE_GRAPH_INDENT; i++)
>   		trace_seq_putc(s, ' ');
>   
> -	trace_seq_printf(s, "%ps();\n", (void *)call->func);
> +	/*
> +	 * Write out the function return value if the option function-retval is
> +	 * enabled.
> +	 */
> +	if (flags & __TRACE_GRAPH_PRINT_RETVAL)
> +		print_graph_retval(s, graph_ret->retval, true, (void *)call->func,
> +				!!(flags & TRACE_GRAPH_PRINT_RETVAL_HEX));
> +	else
> +		trace_seq_printf(s, "%ps();\n", (void *)call->func);
>   
>   	print_graph_irq(iter, graph_ret->func, TRACE_GRAPH_RET,
>   			cpu, iter->ent->pid, flags);
> @@ -942,16 +1006,25 @@ print_graph_return(struct ftrace_graph_ret *trace, struct trace_seq *s,
>   		trace_seq_putc(s, ' ');
>   
>   	/*
> -	 * If the return function does not have a matching entry,
> -	 * then the entry was lost. Instead of just printing
> -	 * the '}' and letting the user guess what function this
> -	 * belongs to, write out the function name. Always do
> -	 * that if the funcgraph-tail option is enabled.
> +	 * Always write out the function name and its return value if the
> +	 * function-retval option is enabled.
>   	 */
> -	if (func_match && !(flags & TRACE_GRAPH_PRINT_TAIL))
> -		trace_seq_puts(s, "}\n");
> -	else
> -		trace_seq_printf(s, "} /* %ps */\n", (void *)trace->func);
> +	if (flags & __TRACE_GRAPH_PRINT_RETVAL) {
> +		print_graph_retval(s, trace->retval, false, (void *)trace->func,
> +			!!(flags & TRACE_GRAPH_PRINT_RETVAL_HEX));
> +	} else {
> +		/*
> +		 * If the return function does not have a matching entry,
> +		 * then the entry was lost. Instead of just printing
> +		 * the '}' and letting the user guess what function this
> +		 * belongs to, write out the function name. Always do
> +		 * that if the funcgraph-tail option is enabled.
> +		 */
> +		if (func_match && !(flags & TRACE_GRAPH_PRINT_TAIL))
> +			trace_seq_puts(s, "}\n");
> +		else
> +			trace_seq_printf(s, "} /* %ps */\n", (void *)trace->func);
> +	}
>   
>   	/* Overrun */
>   	if (flags & TRACE_GRAPH_PRINT_OVERRUN)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ