[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a7b8518583b7c19756c2e27f4bfc1afc696d87fe.camel@linux.ibm.com>
Date: Fri, 14 Nov 2025 13:44:30 +0100
From: Ilya Leoshkevich <iii@...ux.ibm.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org, Heiko
Carstens <hca@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>,
Alexander
Gordeev <agordeev@...ux.ibm.com>
Subject: Re: [PATCH 1/5] perf jitdump: Fix PID namespace detection
On Fri, 2025-11-14 at 00:07 -0800, Namhyung Kim wrote:
> Hello,
>
> Sorry for the delay.
>
> On Fri, Nov 07, 2025 at 09:19:47AM +0100, Ilya Leoshkevich wrote:
> > On Thu, 2025-11-06 at 18:16 -0800, Namhyung Kim wrote:
> > > On Wed, Nov 05, 2025 at 08:10:24PM +0100, Ilya Leoshkevich wrote:
> > > > perf inject fails to detect jitdump file produced by a process
> > > > running in a different PID namespace if this process has not
> > > > exited
> > > > yet.
> > > >
> > > > The PID namespace heuristic in jit_detect() compares two PIDs:
> > > >
> > > > * pid: outermost NStgid of mmap(jitdump) caller from perf's
> > > > PoV.
> > > > * nsinfo__nstgid(nsi): innermost NStgid of mmap(jitdump) caller
> > > > from
> > > > perf's PoV.
> > > >
> > > > The semantics of the in_pidns variable can be seen in, e.g.,
> > > > nsinfo__get_nspid(): it's true if and only if perf and the
> > > > profiled
> > > > process are in different PID namespaces.
> > > >
> > > > The current logic is clearly inverted: if pid and
> > > > nsinfo__nstgid(nsi)
> > > > are different, then the profiled process must be in a different
> > > > PID
> > > > namespace. This, of course, ignores that fact that they may end
> > > > up
> > > > being equal by accident, but that's not the point here.
> > > >
> > > > Fix by flipping the comparison.
> > > >
> > > > Changing just that, however, breaks the case when the process
> > > > has
> > > > exited. Add explicit support for that by adding "synthesized"
> > > > field
> > > > to
> > > > nsinfo, which tracks whether NStgid was obtained from a running
> > > > process (ignoring considerations of PID reuse or running inject
> > > > on
> > > > a different machine). When the namespace information is
> > > > synthesized,
> > > > assume the process ran in a different PID namespace.
> > >
> > > I'm not sure I'm following. It'd be great if anyone understand
> > > the
> > > topic well could review.
> >
> > Perhaps some data from the testcase from [5/5] can make it more
> > clear.
> > Here are the PIDs that exist in reality:
> >
> > unshare[a] perf-record unshare[b] jshell
> > Host PIDNS: 1000 1001 1002 1003
> > PIDNS[a]: - 1 2 3
> > PIDNS[b]: - - - 1
> >
> > In jit_detect() we deal with 2 of them.
> >
> > - pid is jshell@...NS[a].
> > It is taken from the MMAP2 event, this is how perf sees jshell.
> >
> > - pid2 is jshell@...NS[b].
> > It is taken from "jit-1.dump", this is how jshell sees itself.
> >
> > - nsinfo__nstgid(nsi) ideally should be jshell@...NS[b].
> > This is jshell's innermost NStgid.
> > But perf can see it differently. This is the core of the problem
> > this
> > series deals with.
>
> Thanks a lot for the example and explanation! I'm trying to
> understand.
> :)
Thank you for the patience!
> > Why does nsinfo__nstgid(nsi) vary? Because the kernel does not
> > record
> > it, and perf has to guess it. I have a WIP patch to fix that [1],
> > but
> > it needs a lot more work at this point.
> >
> > How does perf guess it? It looks into /proc/$PID/status. This is
> > quite
> > unreliable, but this is the best perf can do under circumstances.
> > As a
> > result we have 3 possibilities:
> >
> > - The original process is still around. This is the buggy case. In
> > this
> > case nsinfo__nstgid(nsi) == jshell@...NS[b]. IMHO this is a very
> > clear indication of namespacing, and that's why the condition
> > should
> > be flipped.
>
> So perf would look at /proc/3/status and the file would have the
> below
>
> NStgid: 1003 3 1
>
> and *in_pidns should be true, right?
It should be
NStgid: 3 1
because perf itself is namespaced too - in this testcase nobody sees
the root PID namespace. I wrote it this way to make sure there are
no accidental leaks from the root PID namespace, but it's not very
important and perhaps obscures things somewhat unnecessarily.
But regardless, I believe that in this case setting *in_pidns to true
is the right thing.
> > - The original process has exited and PID was not reused. I believe
> > this is the use case the current code has been extensively tested
> > with. In this case perf assumes there was no namespacing and
> > nsinfo__nstgid(nsi) == pid. That's why I need the "synthesized"
> > field: to indicate that NStgid is just an assumption and didn't
> > come
> > from any real data source.
>
> Ok, can you please put short comments on each boolean field so that
> we
> can see the meaning of them clearly?
Sure, I can add this in v2.
[...]
Powered by blists - more mailing lists