[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250401173249.42d43a28@gandalf.local.home>
Date: Tue, 1 Apr 2025 17:32:49 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Andrii Nakryiko <andrii@...nel.org>
Cc: linux-trace-kernel@...r.kernel.org, peterz@...radead.org,
mingo@...nel.org, bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
kernel-team@...a.com, mhocko@...nel.org, oleg@...hat.com,
brauner@...nel.org, glider@...gle.com, mhiramat@...nel.org,
mathieu.desnoyers@...icios.com, akpm@...ux-foundation.org
Subject: Re: [PATCH] exit: add trace_task_exit() tracepoint before
current->mm is reset
On Tue, 1 Apr 2025 11:40:21 -0700
Andrii Nakryiko <andrii@...nel.org> wrote:
Hi Andrii,
> It is useful to be able to access current->mm to, say, record a bunch of
> VMA information right before the task exits (e.g., for stack
> symbolization reasons when dealing with short-lived processes that exit
> in the middle of profiling session). We currently do have
> trace_sched_process_exit() in the exit path, but it is called a bit too
> late, after exit_mm() resets current->mm to NULL, which makes it
> unsuitable for inspecting and recording task's mm_struct-related data
> when tracing process lifetimes.
My fear of adding another task exit trace event is that it will get a
bit confusing as that we now have trace_sched_process_exit() and also
trace_task_exit() with slightly different semantics.
How about adding a trace_exit_mm()? Add that to the exit_mm() code?
static void exit_mm(void)
{
struct mm_struct *mm = current->mm;
exit_mm_release(current, mm);
trace_exit_mm(mm);
??
>
> There is a particularly suitable place, though, right after
> taskstats_exit() is called, but before we do exit_mm(). taskstats
> performs a similar kind of accounting that some applications do with
> BPF, and so co-locating them seems like a good fit.
>
> Moving trace_sched_process_exit() a bit earlier would solve this problem
> as well, and I'm open to that. But this might potentially change its
> semantics a little, and so instead of risking that, I went for adding
> a new trace_task_exit() tracepoint instead. Tracepoints have zero
> overhead at runtime, unless actively traced, so this seems acceptable.
>
> Also, existing trace_sched_process_exit() tracepoint is notoriously
> missing `group_dead` flag that is certainly useful in practice and some
> of our production applications have to work around this. So plumb
> `group_dead` through while at it, to have a richer and more complete
> tracepoint.
There should be no problem adding group_dead to the
trace_sched_process_exit() trace event. Adding fields should never cause
any user API breakage.
-- Steve
Powered by blists - more mailing lists