Message-ID: <CABPqkBQUuX32dzFAQ2pVWgohGx+fRj54_PrcuKBJgjapE4vc9w@mail.gmail.com>
Date: Thu, 5 Nov 2020 00:29:54 -0800
From: Stephane Eranian <eranian@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Ingo Molnar <mingo@...nel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...hat.com>,
LKML <linux-kernel@...r.kernel.org>,
Andi Kleen <ak@...ux.intel.com>,
Ian Rogers <irogers@...gle.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Gabriel Marin <gmx@...gle.com>
Subject: Re: [RFC 0/2] perf/core: Invoke pmu::sched_task callback for cpu events
On Mon, Nov 2, 2020 at 6:52 AM Namhyung Kim <namhyung@...nel.org> wrote:
>
> Hello,
>
> It was reported that system-wide events with precise_ip set have a lot
> of unknown symbols on Intel machines. Depending on the system load I
> can see more than 30% of total symbols are not resolved (actually
> don't have DSO mappings).
>
> I found that it only happens when large PEBS is enabled - using
> call-graph or frequency mode disables large PEBS and gives valid
> results. I verified this by checking whether
> intel_pmu_pebs_sched_task() is called, as below:
>
>   # perf probe -a intel_pmu_pebs_sched_task
>
>   # perf stat -a -e probe:intel_pmu_pebs_sched_task \
>   >   perf record -a -e cycles:ppp -c 100001 sleep 1
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 2.625 MB perf.data (10345 samples) ]
>
>    Performance counter stats for 'system wide':
>
>                    0      probe:intel_pmu_pebs_sched_task
>
>          2.157533991 seconds time elapsed
>
>
> Looking at the code, I found out that the pmu::sched_task callback was
> changed recently so that it's called only for task events. So cpu
> events with large PEBS didn't flush the buffer, and the samples were
> later attributed to unrelated tasks, resulting in unresolved symbols.
>
> This patch reverts it and keeps the optimization for task events.
> While at it, I also found the context switch callback was not enabled
> for cpu events from the beginning. So I've added it too. With this
> applied, I can see the above callbacks are hit as expected and perf
> report has valid symbols.
>
This is a serious bug that impacts many kernel versions as soon as
multi-entry PEBS is activated by the kernel in system-wide mode.
I remember this working in the past, so it must have been broken by
some code refactoring, optimization, or extension of sched_task
to other features. PEBS must be flushed on context switch in per-cpu
mode; otherwise samples may be attributed to a process other than the
one that generated them, because the buffer is drained in the context
of whichever task happens to be running at that point. PEBS does not
tag samples with PID/TID.
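
To make the failure mode concrete, here is a toy model in Python. It is
only a sketch and not kernel code: the schedule, the buffer size, and the
function names (drain, count_misattributed) are all made up for
illustration. It assumes the buffer is drained either when it fills up
or, optionally, on context switch, and that every drained record is
attributed to the task running at drain time, since the records carry no
PID/TID of their own.

```python
def drain(buffer, current_pid):
    """Attribute every buffered record to current_pid.

    Returns how many records really came from a different task.
    The true owner is tracked here only to score the model; the
    records themselves are assumed to carry no PID/TID.
    """
    wrong = sum(1 for owner in buffer if owner != current_pid)
    buffer.clear()
    return wrong


def count_misattributed(schedule, buffer_size, flush_on_switch):
    """Count misattributed samples for one CPU.

    schedule: list of (pid, nsamples) slices run back to back,
              with a context switch between consecutive slices.
    buffer_size: number of records that triggers a drain (hypothetical).
    flush_on_switch: drain the buffer before switching tasks.
    """
    buffer = []
    wrong = 0
    current = None
    for pid, nsamples in schedule:
        # Context switch: `current` goes out, `pid` comes in.
        if flush_on_switch and current is not None:
            wrong += drain(buffer, current)
        current = pid
        for _ in range(nsamples):
            buffer.append(pid)
            if len(buffer) >= buffer_size:
                wrong += drain(buffer, current)
    if current is not None:
        wrong += drain(buffer, current)  # final drain when sampling stops
    return wrong


if __name__ == "__main__":
    schedule = [(100, 3), (200, 3), (300, 2)]
    print(count_misattributed(schedule, 4, flush_on_switch=False))  # 5
    print(count_misattributed(schedule, 4, flush_on_switch=True))   # 0
```

Without the flush, records left over from the outgoing task are charged
to whoever is running when the buffer finally fills. With the flush on
every switch, the buffer only ever holds records from the current task.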