lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b02b6c7a1fcea87248a60ee9c94aaa6e078316d0.camel@linux.ibm.com>
Date: Fri, 07 Nov 2025 09:19:47 +0100
From: Ilya Leoshkevich <iii@...ux.ibm.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>, linux-kernel@...r.kernel.org,
        linux-perf-users@...r.kernel.org, Heiko
 Carstens <hca@...ux.ibm.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Alexander
 Gordeev <agordeev@...ux.ibm.com>
Subject: Re: [PATCH 1/5] perf jitdump: Fix PID namespace detection

On Thu, 2025-11-06 at 18:16 -0800, Namhyung Kim wrote:
> On Wed, Nov 05, 2025 at 08:10:24PM +0100, Ilya Leoshkevich wrote:
> > perf inject fails to detect jitdump file produced by a process
> > running in a different PID namespace if this process has not exited
> > yet.
> > 
> > The PID namespace heuristic in jit_detect() compares two PIDs:
> > 
> > * pid: outermost NStgid of mmap(jitdump) caller from perf's PoV.
> > * nsinfo__nstgid(nsi): innermost NStgid of mmap(jitdump) caller
> > from
> >                        perf's PoV.
> > 
> > The semantics of the in_pidns variable can be seen in, e.g.,
> > nsinfo__get_nspid(): it's true if and only if perf and the profiled
> > process are in different PID namespaces.
> > 
> > The current logic is clearly inverted: if pid and
> > nsinfo__nstgid(nsi)
> > are different, then the profiled process must be in a different PID
> > namespace. This, of course, ignores that fact that they may end up
> > being equal by accident, but that's not the point here.
> > 
> > Fix by flipping the comparison.
> > 
> > Changing just that, however, breaks the case when the process has
> > exited. Add explicit support for that by adding "synthesized" field
> > to
> > nsinfo, which tracks whether NStgid was obtained from a running
> > process (ignoring considerations of PID reuse or running inject on
> > a different machine). When the namespace information is
> > synthesized,
> > assume the process ran in a different PID namespace.
> 
> I'm not sure I'm following.  It'd be great if anyone understand the
> topic well could review.

Perhaps some data from the testcase from [5/5] can make it more clear.
Here are the PIDs that exist in reality:

             unshare[a] perf-record unshare[b] jshell
Host PIDNS:  1000       1001        1002       1003
PIDNS[a]:       -          1           2          3
PIDNS[b]:       -          -           -          1

In jit_detect() we deal with 2 of them.

- pid is jshell@...NS[a].
  It is taken from the MMAP2 event, this is how perf sees jshell.

- pid2 is jshell@...NS[b].
  It is taken from "jit-1.dump", this is how jshell sees itself.

- nsinfo__nstgid(nsi) ideally should be jshell@...NS[b].
  This is jshell's innermost NStgid.
  But perf can see it differently. This is the core of the problem this
  series deals with.

Why does nsinfo__nstgid(nsi) vary? Because the kernel does not record
it, and perf has to guess it. I have a WIP patch to fix that [1], but
it needs a lot more work at this point.

How does perf guess it? It looks into /proc/$PID/status. This is quite
unreliable, but this is the best perf can do under circumstances. As a
result we have 3 possibilities:

- The original process is still around. This is the buggy case. In this
  case nsinfo__nstgid(nsi) == jshell@...NS[b]. IMHO this is a very
  clear indication of namespacing, and that's why the condition should
  be flipped.

- The original process has exited and PID was not reused. I believe
  this is the use case the current code has been extensively tested
  with. In this case perf assumes there was no namespacing and
  nsinfo__nstgid(nsi) == pid. That's why I need the "synthesized"
  field: to indicate that NStgid is just an assumption and didn't come
  from any real data source.

- The original process has exited ans PID was reused. I am not trying
  to deal with this here, because this is rare and absolutely anything
  is possible. The only concern is running perf inject on a different
  machine, but I'm also not sure whether we should care about this.

[1]
https://github.com/iii-i/linux/commit/4b519b774eed850abe0757df39be13a704d2748e

[...]

> > diff --git a/tools/perf/util/namespaces.h
> > b/tools/perf/util/namespaces.h
> > index e95c79b80e27c..41ba2ea8137e5 100644
> > --- a/tools/perf/util/namespaces.h
> > +++ b/tools/perf/util/namespaces.h
> > @@ -38,6 +38,7 @@ DECLARE_RC_STRUCT(nsinfo) {
> >  	bool			in_pidns;
> >  	char			*mntns_path;
> >  	refcount_t		refcnt;
> > +	bool			synthesized;
> 
> It'd be nice if you can put this along with other bool fields.

Sure, will do.

[...]
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ