[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211224124343.GA2495042@leoy-ThinkPad-X240s>
Date: Fri, 24 Dec 2021 20:43:43 +0800
From: Leo Yan <leo.yan@...aro.org>
To: Arnaldo Carvalho de Melo <acme@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
KP Singh <kpsingh@...nel.org>, Ian Rogers <irogers@...gle.com>,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org,
James Clark <James.Clark@....com>
Subject: Re: [PATCH v1] perf synthetic-events: Return error if procfs isn't
mounted for PID namespaces
+ James, sorry for spamming.
On Fri, Dec 24, 2021 at 08:40:13PM +0800, Leo Yan wrote:
> For perf recording, it retrieves process info by iterating nodes in proc
> fs. If we run perf in a non-root PID namespace with command:
>
> # unshare --fork --pid perf record -e cycles -a -- test_program
>
> ... in this case, unshare command creates a child PID namespace and
> launches perf tool in it, but the issue is the proc fs is not mounted
> for the non-root PID namespace, this leads to the perf tool gathering
> process info from its parent PID namespace.
>
> We can use below command to observe the process nodes under proc fs:
>
> # unshare --pid --fork ls /proc
> 1 137 1968 2128 3 342 48 62 78 crypto kcore net uptime
> 10 138 2 2142 30 35 49 63 8 devices keys pagetypeinfo version
> 11 139 20 2143 304 36 50 64 82 device-tree key-users partitions vmallocinfo
> 12 14 2011 22 305 37 51 65 83 diskstats kmsg self vmstat
> 128 140 2038 23 307 39 52 656 84 driver kpagecgroup slabinfo zoneinfo
> 129 15 2074 24 309 4 53 67 9 execdomains kpagecount softirqs
> 13 16 2094 241 31 40 54 68 asound fb kpageflags stat
> 130 164 2096 242 310 41 55 69 buddyinfo filesystems loadavg swaps
> 131 17 2098 25 317 42 56 70 bus fs locks sys
> 132 175 21 26 32 43 57 71 cgroups interrupts meminfo sysrq-trigger
> 133 179 2102 263 329 44 58 75 cmdline iomem misc sysvipc
> 134 1875 2103 27 330 45 59 76 config.gz ioports modules thread-self
> 135 19 2117 29 333 46 6 77 consoles irq mounts timer_list
> 136 1941 2121 298 34 47 60 773 cpuinfo kallsyms mtd tty
>
> So it shows many existed tasks, since unshared command has not mounted
> the proc fs for the new created PID namespace, it still accesses the
> proc fs of the root PID namespace. This leads to two prominent issues:
>
> - Firstly, PID values are mismatched between thread info and samples.
> The gathered thread info are coming from the proc fs of the root PID
> namespace, but samples record its PID from the child PID namespace.
>
> - The second issue is profiled program 'test_program' returns its forked
> PID number from the child PID namespace, perf tool wrongly uses this
> PID number to retrieve the process info via the proc fs of the root
> PID namespace.
>
> To avoid issues, we need to mount proc fs for the child PID namespace
> with the option '--mount-proc' when use unshare command:
>
> # unshare --fork --pid --mount-proc perf record -e cycles -a -- test_program
>
> Conversely, when the proc fs of the root PID namespace is used by child
> namespace, perf tool can detect the multiple PID levels and
> nsinfo__is_in_root_namespace() returns false, this patch reports error
> for this case:
>
> # unshare --fork --pid perf record -e cycles -a -- test_program
> Couldn't synthesize bpf events.
> Perf runs in non-root PID namespace but it tries to gather process info from its parent PID namespace.
> Please mount the proc file system properly, e.g. add the option '--mount-proc' for unshare command.
>
> Signed-off-by: Leo Yan <leo.yan@...aro.org>
> ---
> tools/perf/util/synthetic-events.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
> index 198982109f0f..cf7800347f77 100644
> --- a/tools/perf/util/synthetic-events.c
> +++ b/tools/perf/util/synthetic-events.c
> @@ -1784,6 +1784,25 @@ int __machine__synthesize_threads(struct machine *machine, struct perf_tool *too
> perf_event__handler_t process, bool needs_mmap,
> bool data_mmap, unsigned int nr_threads_synthesize)
> {
> + /*
> + * When perf runs in non-root PID namespace, and the namespace's proc FS
> + * is not mounted, nsinfo__is_in_root_namespace() returns false.
> + * In this case, the proc FS is coming for the parent namespace, thus
> + * perf tool will wrongly gather process info from its parent PID
> + * namespace.
> + *
> + * To avoid the confusion that the perf tool runs in a child PID
> + * namespace but it synthesizes thread info from its parent PID
> + * namespace, returns failure with warning.
> + */
> + if (!nsinfo__is_in_root_namespace()) {
> + pr_err("Perf runs in non-root PID namespace but it tries to ");
> + pr_err("gather process info from its parent PID namespace.\n");
> + pr_err("Please mount the proc file system properly, e.g. ");
> + pr_err("add the option '--mount-proc' for unshare command.\n");
> + return -EPERM;
> + }
> +
> if (target__has_task(target))
> return perf_event__synthesize_thread_map(tool, threads, process, machine,
> needs_mmap, data_mmap);
> --
> 2.25.1
>
Powered by blists - more mailing lists