Message-ID: <20250429100007.3225e7eb@gandalf.local.home>
Date: Tue, 29 Apr 2025 10:00:07 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Namhyung Kim <namhyung@...nel.org>
Cc: linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org, Masami
Hiramatsu <mhiramat@...nel.org>, Mark Rutland <mark.rutland@....com>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Andrew Morton
<akpm@...ux-foundation.org>, Josh Poimboeuf <jpoimboe@...nel.org>,
x86@...nel.org, Peter Zijlstra <peterz@...radead.org>, Ingo Molnar
<mingo@...nel.org>, Arnaldo Carvalho de Melo <acme@...nel.org>, Indu Bhagat
<indu.bhagat@...cle.com>, Alexander Shishkin
<alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, Ian
Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>,
linux-perf-users@...r.kernel.org, Mark Brown <broonie@...nel.org>,
linux-toolchains@...r.kernel.org, Jordan Rome <jordalgo@...a.com>, Sam
James <sam@...too.org>, Andrii Nakryiko <andrii.nakryiko@...il.com>, Jens
Remus <jremus@...ux.ibm.com>, Florian Weimer <fweimer@...hat.com>, Andy
Lutomirski <luto@...nel.org>, Weinan Liu <wnliu@...gle.com>, Blake Jones
<blakejones@...gle.com>, Beau Belgrave <beaub@...ux.microsoft.com>, "Jose
E. Marchesi" <jemarch@....org>
Subject: Re: [PATCH v5 13/17] perf: Support deferred user callchains
On Mon, 28 Apr 2025 17:29:53 -0700
Namhyung Kim <namhyung@...nel.org> wrote:
> Thing is that the kernel doesn't know the relationship between events.
> For example, if I run this command on a machine with 100 CPUs:
>
> $ perf record -e cycles,instructions -- $MYPROG
>
> it would open 200 events and they don't know each other. Later, another
> process can start a new perf profiling for the same task. IIUC there's
> no way to identify which one is related in the kernel.
>
> So I think we need a way to share some information for those 200 events
> and then emit deferred callchain records with the shared info.
Hmm, I'm thinking of creating an internal perf descriptor that would join
events by who created them. That is, the first event created will take the
thread leader (pid of the task) and check if an entity exists for it. If
one doesn't exist, it will create it and add itself to that entity if it has
a deferred trace attribute set. If it already exists, it will just add
itself to it. This deferred descriptor will register itself with the
deferred unwinder like ftrace does (one per process), and then use it to
defer callbacks. When the callback happens, it will look for the thread
event or CPU event that matches the current thread or current CPU and
record the backtrace there.
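
To make that idea concrete, here is a rough user-space sketch of the
bookkeeping, not kernel code and not the eventual patch. Every name in it
(sketch_event, deferred_desc, desc_join, desc_record) and the fixed-size
tables are made up for illustration; registration with the deferred
unwinder and all locking are left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DESCS  8	/* illustrative limits only, not kernel values */
#define MAX_EVENTS 16

/* Hypothetical stand-in for a perf event bound to a thread or a CPU. */
struct sketch_event {
	int tid;	/* target thread, or -1 for a per-CPU event */
	int cpu;	/* target CPU, or -1 for a per-thread event */
	int deferred;	/* nonzero if the deferred trace attribute is set */
	int nr_traces;	/* user backtraces recorded into this event */
};

/* Hypothetical descriptor: one per creating process (thread leader). */
struct deferred_desc {
	int leader_pid;
	int nr_events;
	struct sketch_event *events[MAX_EVENTS];
};

static struct deferred_desc descs[MAX_DESCS];
static int nr_descs;

/*
 * Join an event to its creator's descriptor, creating the descriptor on
 * first use. Returns NULL if the event did not ask for deferred traces
 * or if a table is full.
 */
struct deferred_desc *desc_join(int leader_pid, struct sketch_event *ev)
{
	struct deferred_desc *d = NULL;
	int i;

	if (!ev->deferred)
		return NULL;

	for (i = 0; i < nr_descs; i++) {
		if (descs[i].leader_pid == leader_pid) {
			d = &descs[i];
			break;
		}
	}

	if (!d) {
		if (nr_descs == MAX_DESCS)
			return NULL;
		d = &descs[nr_descs++];
		d->leader_pid = leader_pid;
		d->nr_events = 0;
	}

	if (d->nr_events == MAX_EVENTS)
		return NULL;
	d->events[d->nr_events++] = ev;
	return d;
}

/*
 * Stand-in for the deferred callback: find the first event that matches
 * the current thread or the current CPU and record the backtrace there
 * (modelled here as a counter). Returns the event, or NULL if none match.
 */
struct sketch_event *desc_record(struct deferred_desc *d, int cur_tid,
				 int cur_cpu)
{
	int i;

	assert(d != NULL);
	for (i = 0; i < d->nr_events; i++) {
		struct sketch_event *ev = d->events[i];

		if ((ev->tid >= 0 && ev->tid == cur_tid) ||
		    (ev->cpu >= 0 && ev->cpu == cur_cpu)) {
			ev->nr_traces++;
			return ev;
		}
	}
	return NULL;
}
```

With per-CPU events, a callback for a task running on CPU 1 would land in
the event opened for CPU 1, whichever of the creator's events that is.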
>
> >
> > It could use the cookie method that ftrace uses, where the request gets a
> > cookie, and can be recorded to the perf event in the interrupt. Then the
> > callchain would record the cookie along with the stack trace, and then perf
> > tool could just match up the kernel stacks with their cookies to the user
> > stack with its cookie.
>
> Yep, but the kernel should know which events (or ring buffer) it should
> emit the deferred callchains to. I don't think it needs to include the
> cookie in the perf data, but it can be used to find which event or ring
> buffer for the session is related to this request.
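
For reference, the tool-side matching in the cookie method quoted above
could look roughly like the sketch below. It assumes the cookie is present
in both record types, which is exactly the point under discussion, and the
struct layouts and the attach_user_callchain() helper are hypothetical, not
an existing perf record format or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sample taken in the interrupt: kernel stack plus cookie. */
struct kernel_sample {
	uint64_t cookie;
	const char *kstack;
	const char *ustack;	/* NULL until the deferred callchain arrives */
};

/* Hypothetical deferred record: user stack plus the same cookie. */
struct user_callchain {
	uint64_t cookie;
	const char *ustack;
};

/*
 * Attach a deferred user callchain to every kernel sample that carries
 * the same cookie. Returns the number of samples that were completed.
 */
size_t attach_user_callchain(struct kernel_sample *samples, size_t n,
			     const struct user_callchain *uc)
{
	size_t i, matched = 0;

	assert(samples != NULL && uc != NULL);
	for (i = 0; i < n; i++) {
		if (samples[i].cookie == uc->cookie) {
			samples[i].ustack = uc->ustack;
			matched++;
		}
	}
	return matched;
}
```

Several interrupts can hit before the task returns to user space, so more
than one kernel sample may share a cookie and a single user callchain.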
Let me see if my suggestion would work or not. I'll try it out, see what
happens, and post patches later.
-- Steve