Message-ID: <20250829141142.3ffc8111@gandalf.local.home>
Date: Fri, 29 Aug 2025 14:11:42 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Arnaldo Carvalho de Melo <arnaldo.melo@...il.com>, Steven Rostedt
<rostedt@...nel.org>, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, bpf@...r.kernel.org, x86@...nel.org,
Masami Hiramatsu <mhiramat@...nel.org>, Mathieu Desnoyers
<mathieu.desnoyers@...icios.com>, Josh Poimboeuf <jpoimboe@...nel.org>,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>, Jiri
Olsa <jolsa@...nel.org>, Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>, Thomas Gleixner <tglx@...utronix.de>,
Andrii Nakryiko <andrii@...nel.org>, Indu Bhagat <indu.bhagat@...cle.com>,
"Jose E. Marchesi" <jemarch@....org>, Beau Belgrave
<beaub@...ux.microsoft.com>, Jens Remus <jremus@...ux.ibm.com>, Andrew
Morton <akpm@...ux-foundation.org>, Florian Weimer <fweimer@...hat.com>,
Sam James <sam@...too.org>, Kees Cook <kees@...nel.org>, "Carlos O'Donell"
<codonell@...hat.com>
Subject: Re: [PATCH v6 5/6] tracing: Show inode and device major:minor in
deferred user space stacktrace
On Fri, 29 Aug 2025 10:33:38 -0700
Linus Torvalds <torvalds@...ux-foundation.org> wrote:
> On Fri, 29 Aug 2025 at 10:18, Arnaldo Carvalho de Melo
> <arnaldo.melo@...il.com> wrote:
> >
> > As long as we don't lose those mmap events due to memory pressure/lost
> > events and we have timestamps to order it all before lookups, yeah
> > should work.
>
> The main reason to lose mmap events that I can see is that you start
> tracing in the middle of running something (for example, tracing
> systemd or some other "started at boot" thing).
Note, for on-demand tracing, the applications are already running before
the tracing starts. That is actually the common case. Yes, people do often
"enable tracing, run my code, stop tracing", but in most of the use cases I
deal with, it's "we are noticing something in the field, start tracing,
issue gets hit, stop tracing", where the applications we are monitoring are
already running when the tracing starts. Just tracing the mmap when it
happens will not be useful for us.
Not to mention, in the future, this will also have to work with JIT. I was
thinking of using 64 bit hashes in the stack trace, where the top bits are
reserved for context (is this a file, or something dynamically created).
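
A minimal sketch of what such a tagged 64-bit key could look like. The
4-bit context field, the enum values, and the helper names (make_key,
key_context, key_hash) are assumptions made purely for illustration;
nothing in this thread fixes the actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: top 4 bits carry the context, low 60 bits the hash. */
#define KEY_CTX_BITS	4
#define KEY_CTX_SHIFT	(64 - KEY_CTX_BITS)
#define KEY_HASH_MASK	((UINT64_C(1) << KEY_CTX_SHIFT) - 1)

/* Illustrative contexts only; the real set is not defined in this thread. */
enum key_ctx {
	KEY_CTX_FILE = 0,	/* mapping backed by a file */
	KEY_CTX_JIT  = 1,	/* dynamically created code */
};

/* Combine a context tag with a hash; hash bits above bit 59 are dropped. */
static inline uint64_t make_key(enum key_ctx ctx, uint64_t hash)
{
	assert((unsigned int)ctx < (1u << KEY_CTX_BITS));
	return ((uint64_t)ctx << KEY_CTX_SHIFT) | (hash & KEY_HASH_MASK);
}

/* Extract the context tag from the top bits. */
static inline enum key_ctx key_context(uint64_t key)
{
	return (enum key_ctx)(key >> KEY_CTX_SHIFT);
}

/* Extract the hash from the low bits. */
static inline uint64_t key_hash(uint64_t key)
{
	return key & KEY_HASH_MASK;
}
```

With a layout like this, a file mapping and a JIT region that happen to
hash to the same value still produce different keys.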
>
> Then you'd not have any record of an actual mmap at all because it
> happened before you started tracing, even if there is no memory
> pressure or other thing going on.
>
> That is not necessarily a show-stopper: you could have some fairly
> simple count for "how many times have I seen this hash", and add a
> "mmap reminder" event (which would just be the exact same thing as the
> regular mmap event).
I thought about clearing the file cache periodically, if for no other
reason than to handle dropped events where the mapping is lost.
This is why I'm looking at clearing on "unmap". Yes, we don't care about
the unmap itself, but as soon as an unmap happens, if that value gets used
again then we know it's a new mapping. That is, drop the hashes from the
file cache when they are no longer around.
The idea is this (pseudo code):

 user_stack_trace() {
	foreach vma in each stack frame {
		key = hash(vma->vm_file);
		if (!lookup(key)) {
			trace_file_map(key, generate_path(vma), generate_buildid(vma));
			add_into_hash(key);
		}
	}
 }

On unmapping:

	key = hash(vma->vm_file);
	remove_from_hash(key);

Now if a new mmap happens where the vma->vm_file is reused, the lookup(key)
will return false again and the file_map event will get triggered again.
We don't need to look at the mmap() calls, as those new mappings may never
end up in a user stack trace, and writing them out will just waste space in
the ring buffer.
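
The lookup/add/remove flow in the pseudo code above can be modelled in
plain user-space C. This is only a sketch under stated assumptions: the
fixed-size linear-scan table, struct file_cache, trace_frame() and
on_unmap() are hypothetical names invented for the example, and a counter
stands in for emitting the file_map event. A kernel implementation would
use a real hash table and proper locking.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 64	/* arbitrary size for the sketch */

struct file_cache {
	uint64_t keys[CACHE_SLOTS];
	bool used[CACHE_SLOTS];
	unsigned long events;	/* file_map events "emitted" so far */
};

/* Return true if the key is currently in the cache. */
static bool cache_lookup(const struct file_cache *c, uint64_t key)
{
	for (size_t i = 0; i < CACHE_SLOTS; i++)
		if (c->used[i] && c->keys[i] == key)
			return true;
	return false;
}

/* Store the key in the first free slot; returns false if the table is full. */
static bool cache_add(struct file_cache *c, uint64_t key)
{
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		if (!c->used[i]) {
			c->keys[i] = key;
			c->used[i] = true;
			return true;
		}
	}
	return false;	/* full: the event is just emitted again next time */
}

/* Drop the key so that a later reuse is treated as a new mapping. */
static void cache_remove(struct file_cache *c, uint64_t key)
{
	for (size_t i = 0; i < CACHE_SLOTS; i++)
		if (c->used[i] && c->keys[i] == key)
			c->used[i] = false;
}

/*
 * Called for each frame of a user stack trace. Returns true when a
 * file_map event would be written (first sighting of this key).
 */
static bool trace_frame(struct file_cache *c, uint64_t key)
{
	if (cache_lookup(c, key))
		return false;
	c->events++;	/* stands in for trace_file_map(key, path, buildid) */
	cache_add(c, key);
	return true;
}

/* Called when the mapping goes away. */
static void on_unmap(struct file_cache *c, uint64_t key)
{
	cache_remove(c, key);
}
```

In this model, tracing the same key twice emits one event, and tracing it
again after on_unmap() emits a second one, which matches the behaviour
described above for a reused vma->vm_file.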
-- Steve