lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20220204124849.GB2040169@leoy-ThinkPad-X240s>
Date:   Fri, 4 Feb 2022 20:48:49 +0800
From:   Leo Yan <leo.yan@...aro.org>
To:     James Clark <james.clark@....com>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...hat.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        Andrii Nakryiko <andrii@...nel.org>,
        Martin KaFai Lau <kafai@...com>,
        Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
        John Fastabend <john.fastabend@...il.com>,
        KP Singh <kpsingh@...nel.org>, Ian Rogers <irogers@...gle.com>,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v1] perf synthetic-events: Return error if procfs isn't
 mounted for PID namespaces

On Fri, Jan 07, 2022 at 02:27:57PM +0000, James Clark wrote:
> On 24/12/2021 12:40, Leo Yan wrote:
> > For perf recording, it retrieves process info by iterating nodes in proc
> > fs.  If we run perf in a non-root PID namespace with command:
> > 
> >   # unshare --fork --pid perf record -e cycles -a -- test_program
> > 
> > ... in this case, unshare command creates a child PID namespace and
> > launches perf tool in it, but the issue is the proc fs is not mounted
> > for the non-root PID namespace, this leads to the perf tool gathering
> > process info from its parent PID namespace.
> > 
> 
> I had some concerns that this would prevent use of perf in docker, but docker
> does mount /proc so it is still working. And you've added a warning about how
> to fix the unshare command so I think this is ok.
> 
> Reviewed-by: James Clark <james.clark@....com>

Thanks a lot for review, James.

Arnaldo, Jiri, could you take a look for this patch?  Thanks!

Leo

> > We can use below command to observe the process nodes under proc fs:
> > 
> >   # unshare --pid --fork ls /proc
> > 1    137   1968  2128  3    342  48  62   78	     crypto	  kcore        net	      uptime
> > 10   138   2	 2142  30   35	 49  63   8	     devices	  keys	       pagetypeinfo   version
> > 11   139   20	 2143  304  36	 50  64   82	     device-tree  key-users    partitions     vmallocinfo
> > 12   14    2011  22    305  37	 51  65   83	     diskstats	  kmsg	       self	      vmstat
> > 128  140   2038  23    307  39	 52  656  84	     driver	  kpagecgroup  slabinfo       zoneinfo
> > 129  15    2074  24    309  4	 53  67   9	     execdomains  kpagecount   softirqs
> > 13   16    2094  241   31   40	 54  68   asound     fb		  kpageflags   stat
> > 130  164   2096  242   310  41	 55  69   buddyinfo  filesystems  loadavg      swaps
> > 131  17    2098  25    317  42	 56  70   bus	     fs		  locks        sys
> > 132  175   21	 26    32   43	 57  71   cgroups    interrupts   meminfo      sysrq-trigger
> > 133  179   2102  263   329  44	 58  75   cmdline    iomem	  misc	       sysvipc
> > 134  1875  2103  27    330  45	 59  76   config.gz  ioports	  modules      thread-self
> > 135  19    2117  29    333  46	 6   77   consoles   irq	  mounts       timer_list
> > 136  1941  2121  298   34   47	 60  773  cpuinfo    kallsyms	  mtd	       tty
> > 
> > So it shows many existed tasks, since unshared command has not mounted
> > the proc fs for the new created PID namespace, it still accesses the
> > proc fs of the root PID namespace.  This leads to two prominent issues:
> > 
> > - Firstly, PID values are mismatched between thread info and samples.
> >   The gathered thread info are coming from the proc fs of the root PID
> >   namespace, but samples record its PID from the child PID namespace.
> > 
> > - The second issue is profiled program 'test_program' returns its forked
> >   PID number from the child PID namespace, perf tool wrongly uses this
> >   PID number to retrieve the process info via the proc fs of the root
> >   PID namespace.
> > 
> > To avoid issues, we need to mount proc fs for the child PID namespace
> > with the option '--mount-proc' when use unshare command:
> > 
> >   # unshare --fork --pid --mount-proc perf record -e cycles -a -- test_program
> > 
> > Conversely, when the proc fs of the root PID namespace is used by child
> > namespace, perf tool can detect the multiple PID levels and
> > nsinfo__is_in_root_namespace() returns false, this patch reports error
> > for this case:
> > 
> >   # unshare --fork --pid perf record -e cycles -a -- test_program
> >   Couldn't synthesize bpf events.
> >   Perf runs in non-root PID namespace but it tries to gather process info from its parent PID namespace.
> >   Please mount the proc file system properly, e.g. add the option '--mount-proc' for unshare command.
> > 
> > Signed-off-by: Leo Yan <leo.yan@...aro.org>
> > ---
> >  tools/perf/util/synthetic-events.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> > 
> > diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> > index 198982109f0f..cf7800347f77 100644
> > --- a/tools/perf/util/synthetic-events.c
> > +++ b/tools/perf/util/synthetic-events.c
> > @@ -1784,6 +1784,25 @@ int __machine__synthesize_threads(struct machine *machine, struct perf_tool *too
> >  				  perf_event__handler_t process, bool needs_mmap,
> >  				  bool data_mmap, unsigned int nr_threads_synthesize)
> >  {
> > +	/*
> > +	 * When perf runs in non-root PID namespace, and the namespace's proc FS
> > +	 * is not mounted, nsinfo__is_in_root_namespace() returns false.
> > +	 * In this case, the proc FS is coming for the parent namespace, thus
> > +	 * perf tool will wrongly gather process info from its parent PID
> > +	 * namespace.
> > +	 *
> > +	 * To avoid the confusion that the perf tool runs in a child PID
> > +	 * namespace but it synthesizes thread info from its parent PID
> > +	 * namespace, returns failure with warning.
> > +	 */
> > +	if (!nsinfo__is_in_root_namespace()) {
> > +		pr_err("Perf runs in non-root PID namespace but it tries to ");
> > +		pr_err("gather process info from its parent PID namespace.\n");
> > +		pr_err("Please mount the proc file system properly, e.g. ");
> > +		pr_err("add the option '--mount-proc' for unshare command.\n");
> > +		return -EPERM;
> > +	}
> > +
> >  	if (target__has_task(target))
> >  		return perf_event__synthesize_thread_map(tool, threads, process, machine,
> >  							 needs_mmap, data_mmap);
> > 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ