linux-kernel - Re: [PATCH] tracing: Add new

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <202404090840.E09789B66@keescook>
Date: Tue, 9 Apr 2024 08:46:10 -0700
From: Kees Cook <keescook@...omium.org>
To: Marco Elver <elver@...gle.com>
Cc: Steven Rostedt <rostedt@...dmis.org>,
	Eric Biederman <ebiederm@...ssion.com>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
	Dmitry Vyukov <dvyukov@...gle.com>
Subject: Re: [PATCH] tracing: Add new_exec tracepoint

On Mon, Apr 08, 2024 at 11:01:54AM +0200, Marco Elver wrote:
> Add "new_exec" tracepoint, which is run right after the point of no
> return but before the current task assumes its new exec identity.
> 
> Unlike the tracepoint "sched_process_exec", the "new_exec" tracepoint
> runs before flushing the old exec, i.e. while the task still has the
> original state (such as original MM), but when the new exec either
> succeeds or crashes (but never returns to the original exec).
> 
> Being able to trace this event can be helpful in a number of use cases:
> 
>   * allowing tracing eBPF programs access to the original MM on exec,
>     before current->mm is replaced;
>   * counting exec in the original task (via perf event);
>   * profiling flush time ("new_exec" to "sched_process_exec").
> 
> Example of tracing output ("new_exec" and "sched_process_exec"):
> 
>   $ cat /sys/kernel/debug/tracing/trace_pipe
>       <...>-379     [003] .....   179.626921: new_exec: filename=/usr/bin/sshd pid=379 comm=sshd
>       <...>-379     [003] .....   179.629131: sched_process_exec: filename=/usr/bin/sshd pid=379 old_pid=379
>       <...>-381     [002] .....   180.048580: new_exec: filename=/bin/bash pid=381 comm=sshd
>       <...>-381     [002] .....   180.053122: sched_process_exec: filename=/bin/bash pid=381 old_pid=381
>       <...>-385     [001] .....   180.068277: new_exec: filename=/usr/bin/tty pid=385 comm=bash
>       <...>-385     [001] .....   180.069485: sched_process_exec: filename=/usr/bin/tty pid=385 old_pid=385
>       <...>-389     [006] .....   192.020147: new_exec: filename=/usr/bin/dmesg pid=389 comm=bash
>        bash-389     [006] .....   192.021377: sched_process_exec: filename=/usr/bin/dmesg pid=389 old_pid=389
> 
> Signed-off-by: Marco Elver <elver@...gle.com>
> ---
>  fs/exec.c                   |  2 ++
>  include/trace/events/task.h | 30 ++++++++++++++++++++++++++++++
>  2 files changed, 32 insertions(+)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 38bf71cbdf5e..ab778ae1fc06 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1268,6 +1268,8 @@ int begin_new_exec(struct linux_binprm * bprm)
>  	if (retval)
>  		return retval;
>  
> +	trace_new_exec(current, bprm);
> +

All other steps in this function have explicit comments about
what/why/etc. Please add some kind of comment describing why the
tracepoint is where it is, etc.

For example, maybe something like:

/*
 * Before any changes to 'current', report that the exec is about to
 * happen (since we made it to the point of no return). On a successful
 * exec, the 'sched_process_exec' tracepoint will also fire. On failure,
 * ... [something else]
 */

> +TRACE_EVENT(new_exec,
> +
> +	TP_PROTO(struct task_struct *task, struct linux_binprm *bprm),
> +
> +	TP_ARGS(task, bprm),
> +
> +	TP_STRUCT__entry(
> +		__string(	filename,	bprm->filename	)
> +		__field(	pid_t,		pid		)
> +		__string(	comm,		task->comm	)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(filename, bprm->filename);

What about binfmt_misc, and binfmt_script? You may want bprm->interp
too?

-Kees

> +		__entry->pid = task->pid;
> +		__assign_str(comm, task->comm);
> +	),
> +
> +	TP_printk("filename=%s pid=%d comm=%s",
> +		  __get_str(filename), __entry->pid, __get_str(comm))
> +);
> +
>  #endif
>  
>  /* This part must be outside protection */
> -- 
> 2.44.0.478.gd926399ef9-goog
> 

-- 
Kees Cook