lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Sun, 6 Feb 2022 08:46:37 -0300
From:   Arnaldo Carvalho de Melo <acme@...nel.org>
To:     Leo Yan <leo.yan@...aro.org>
Cc:     James Clark <james.clark@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>, Ian Rogers <irogers@...gle.com>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1] perf synthetic-events: Return error if procfs isn't
 mounted for PID namespaces

Em Fri, Feb 04, 2022 at 08:48:49PM +0800, Leo Yan escreveu:
> On Fri, Jan 07, 2022 at 02:27:57PM +0000, James Clark wrote:
> > On 24/12/2021 12:40, Leo Yan wrote:
> > > For perf recording, it retrieves process info by iterating nodes in proc
> > > fs.  If we run perf in a non-root PID namespace with command:
> > > 
> > >   # unshare --fork --pid perf record -e cycles -a -- test_program
> > > 
> > > ... in this case, unshare command creates a child PID namespace and
> > > launches perf tool in it, but the issue is the proc fs is not mounted
> > > for the non-root PID namespace, this leads to the perf tool gathering
> > > process info from its parent PID namespace.
> > > 
> > 
> > I had some concerns that this would prevent use of perf in docker, but docker
> > does mount /proc so it is still working. And you've added a warning about how
> > to fix the unshare command so I think this is ok.
> > 
> > Reviewed-by: James Clark <james.clark@....com>
> 
> Thanks a lot for review, James.
> 
> Arnaldo, Jiri, could you take a look for this patch?  Thanks!

Looks sane, there was discussion, a detailed set of steps to reproduce
the problem was provided, thanks, applied.

- Arnaldo
 
> Leo
> 
> > > We can use below command to observe the process nodes under proc fs:
> > > 
> > >   # unshare --pid --fork ls /proc
> > > 1    137   1968  2128  3    342  48  62   78	     crypto	  kcore        net	      uptime
> > > 10   138   2	 2142  30   35	 49  63   8	     devices	  keys	       pagetypeinfo   version
> > > 11   139   20	 2143  304  36	 50  64   82	     device-tree  key-users    partitions     vmallocinfo
> > > 12   14    2011  22    305  37	 51  65   83	     diskstats	  kmsg	       self	      vmstat
> > > 128  140   2038  23    307  39	 52  656  84	     driver	  kpagecgroup  slabinfo       zoneinfo
> > > 129  15    2074  24    309  4	 53  67   9	     execdomains  kpagecount   softirqs
> > > 13   16    2094  241   31   40	 54  68   asound     fb		  kpageflags   stat
> > > 130  164   2096  242   310  41	 55  69   buddyinfo  filesystems  loadavg      swaps
> > > 131  17    2098  25    317  42	 56  70   bus	     fs		  locks        sys
> > > 132  175   21	 26    32   43	 57  71   cgroups    interrupts   meminfo      sysrq-trigger
> > > 133  179   2102  263   329  44	 58  75   cmdline    iomem	  misc	       sysvipc
> > > 134  1875  2103  27    330  45	 59  76   config.gz  ioports	  modules      thread-self
> > > 135  19    2117  29    333  46	 6   77   consoles   irq	  mounts       timer_list
> > > 136  1941  2121  298   34   47	 60  773  cpuinfo    kallsyms	  mtd	       tty
> > > 
> > > So it shows many existed tasks, since unshared command has not mounted
> > > the proc fs for the new created PID namespace, it still accesses the
> > > proc fs of the root PID namespace.  This leads to two prominent issues:
> > > 
> > > - Firstly, PID values are mismatched between thread info and samples.
> > >   The gathered thread info are coming from the proc fs of the root PID
> > >   namespace, but samples record its PID from the child PID namespace.
> > > 
> > > - The second issue is profiled program 'test_program' returns its forked
> > >   PID number from the child PID namespace, perf tool wrongly uses this
> > >   PID number to retrieve the process info via the proc fs of the root
> > >   PID namespace.
> > > 
> > > To avoid issues, we need to mount proc fs for the child PID namespace
> > > with the option '--mount-proc' when use unshare command:
> > > 
> > >   # unshare --fork --pid --mount-proc perf record -e cycles -a -- test_program
> > > 
> > > Conversely, when the proc fs of the root PID namespace is used by child
> > > namespace, perf tool can detect the multiple PID levels and
> > > nsinfo__is_in_root_namespace() returns false, this patch reports error
> > > for this case:
> > > 
> > >   # unshare --fork --pid perf record -e cycles -a -- test_program
> > >   Couldn't synthesize bpf events.
> > >   Perf runs in non-root PID namespace but it tries to gather process info from its parent PID namespace.
> > >   Please mount the proc file system properly, e.g. add the option '--mount-proc' for unshare command.
> > > 
> > > Signed-off-by: Leo Yan <leo.yan@...aro.org>
> > > ---
> > >  tools/perf/util/synthetic-events.c | 19 +++++++++++++++++++
> > >  1 file changed, 19 insertions(+)
> > > 
> > > diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> > > index 198982109f0f..cf7800347f77 100644
> > > --- a/tools/perf/util/synthetic-events.c
> > > +++ b/tools/perf/util/synthetic-events.c
> > > @@ -1784,6 +1784,25 @@ int __machine__synthesize_threads(struct machine *machine, struct perf_tool *too
> > >  				  perf_event__handler_t process, bool needs_mmap,
> > >  				  bool data_mmap, unsigned int nr_threads_synthesize)
> > >  {
> > > +	/*
> > > +	 * When perf runs in non-root PID namespace, and the namespace's proc FS
> > > +	 * is not mounted, nsinfo__is_in_root_namespace() returns false.
> > > +	 * In this case, the proc FS is coming for the parent namespace, thus
> > > +	 * perf tool will wrongly gather process info from its parent PID
> > > +	 * namespace.
> > > +	 *
> > > +	 * To avoid the confusion that the perf tool runs in a child PID
> > > +	 * namespace but it synthesizes thread info from its parent PID
> > > +	 * namespace, returns failure with warning.
> > > +	 */
> > > +	if (!nsinfo__is_in_root_namespace()) {
> > > +		pr_err("Perf runs in non-root PID namespace but it tries to ");
> > > +		pr_err("gather process info from its parent PID namespace.\n");
> > > +		pr_err("Please mount the proc file system properly, e.g. ");
> > > +		pr_err("add the option '--mount-proc' for unshare command.\n");
> > > +		return -EPERM;
> > > +	}
> > > +
> > >  	if (target__has_task(target))
> > >  		return perf_event__synthesize_thread_map(tool, threads, process, machine,
> > >  							 needs_mmap, data_mmap);
> > > 

-- 

- Arnaldo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ