Message-ID: <ZvH0-ny9gUzh_Jc7@google.com>
Date: Mon, 23 Sep 2024 16:08:42 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: "Liang, Kan" <kan.liang@...ux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
	Ian Rogers <irogers@...gle.com>, Jiri Olsa <jolsa@...nel.org>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
	linux-perf-users@...r.kernel.org,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Indu Bhagat <indu.bhagat@...cle.com>,
	linux-toolchains@...r.kernel.org
Subject: Re: [RFC/PATCHSET 0/5] perf tools: Support deferred user callchains
 (v2)

Hi Kan,

On Wed, Sep 18, 2024 at 04:26:56PM -0400, Liang, Kan wrote:
> 
> 
> On 2024-09-17 6:28 p.m., Namhyung Kim wrote:
> > Hello,
> > 
> > This is a counterpart for Josh's kernel change v2 [1] to support deferred
> > user callchains.  The change is transparent and users should not notice
> > anything with the deferred callchains.
> > 
> >   $ perf record -g sleep 1
> > 
> > I added --[no-]merge-callchains option to control output of perf script.
> > You can verify it has the deferred callchains like this:
> > 
> >   $ perf script --no-merge-callchains
> >   perf     801 [000]    18.031793:          1 cycles:P:
> >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> > 
> >   perf     801 [000]    18.031814: DEFERRED CALLCHAIN
> >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > 
> >   ...
> > 
> > When the callchain is merged (it's the default) it'd look like below:
> > 
> >   $ perf script
> >   perf     801 [000]    18.031793:          1 cycles:P:
> >           ffffffff91a14c36 __intel_pmu_enable_all.isra.0+0x56 ([kernel.kallsyms])
> >           ffffffff91d373e9 perf_ctx_enable+0x39 ([kernel.kallsyms])
> >           ffffffff91d36af7 event_function+0xd7 ([kernel.kallsyms])
> >           ffffffff91d34222 remote_function+0x42 ([kernel.kallsyms])
> >           ffffffff91c1ebe1 generic_exec_single+0x61 ([kernel.kallsyms])
> >           ffffffff91c1edac smp_call_function_single+0xec ([kernel.kallsyms])
> >           ffffffff91d37a9d event_function_call+0x10d ([kernel.kallsyms])
> >           ffffffff91d33557 perf_event_for_each_child+0x37 ([kernel.kallsyms])
> >           ffffffff91d47324 _perf_ioctl+0x204 ([kernel.kallsyms])
> >           ffffffff91d47c43 perf_ioctl+0x33 ([kernel.kallsyms])
> >           ffffffff91e2f216 __x64_sys_ioctl+0x96 ([kernel.kallsyms])
> >           ffffffff9265f1ae do_syscall_64+0x9e ([kernel.kallsyms])
> >           ffffffff92800130 entry_SYSCALL_64+0xb0 ([kernel.kallsyms])
> >                   7fb5fc22034b __GI___ioctl+0x3b (/usr/lib/x86_64-linux-gnu/libc.so.6)
> > 
> >   ...
> > 
> > Notice the last line: it has __GI___ioctl in the same
> > callchain.  It should work with other tools like perf report.
> 
> 
> It seems it only works with perf report -D, when I test it on a
> non-hybrid machine.
> $ perf record -e branches -g -c 3000000 ~/tchain_edit
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.397 MB perf.data ]
> $ perf report -D | tail -n 17
> 
> Aggregated stats:
>                TOTAL events:       8235
>                 MMAP events:         78  ( 0.9%)
>                 COMM events:          2  ( 0.0%)
>                 EXIT events:          1  ( 0.0%)
>               SAMPLE events:       4060  (49.3%)
>                MMAP2 events:          2  ( 0.0%)
>              KSYMBOL events:         12  ( 0.1%)
>            BPF_EVENT events:         12  ( 0.1%)
>   CALLCHAIN_DEFERRED events:       4060  (49.3%)
>       FINISHED_ROUND events:          3  ( 0.0%)
>             ID_INDEX events:          1  ( 0.0%)
>           THREAD_MAP events:          1  ( 0.0%)
>              CPU_MAP events:          1  ( 0.0%)
>            TIME_CONV events:          1  ( 0.0%)
>        FINISHED_INIT events:          1  ( 0.0%)
> $ perf report
> Error:
> The perf.data data has no samples!
> # To display the perf.data header info, please use
> --header/--header-only options.
> #
> 
> 
> On a hybrid machine, perf record errors out.
> 
> $ perf record -g true
> [ perf record: Woken up 1 times to write data ]
> 0x58a8 [0x38]: failed to process type: 22 [Bad address]
> [ perf record: Captured and wrote 0.022 MB perf.data ]

Thanks for the test, I'll take a look at what I missed.

Thanks,
Namhyung
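
For readers following the thread, the merge step described in the quoted
cover letter can be illustrated with a small sketch.  This is not the perf
tools implementation: the Record type, the "sample"/"deferred" labels and
merge_callchains() are hypothetical names, and it assumes that a deferred
record carries the user frames for every not-yet-completed sample of the
same thread.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Record:
    kind: str                      # "sample" or "deferred" (hypothetical labels)
    tid: int                       # thread the record belongs to
    frames: List[str] = field(default_factory=list)


def merge_callchains(records: List[Record]) -> List[Record]:
    """Append deferred user frames to the pending samples of the same thread."""
    pending: Dict[int, List[Record]] = {}   # tid -> samples awaiting user frames
    merged: List[Record] = []
    for rec in records:
        if rec.kind == "sample":
            # Copy so the input records are left untouched.
            sample = Record("sample", rec.tid, list(rec.frames))
            merged.append(sample)
            pending.setdefault(rec.tid, []).append(sample)
        elif rec.kind == "deferred":
            # Assumption: one deferred record completes all pending samples.
            for sample in pending.pop(rec.tid, []):
                sample.frames.extend(rec.frames)
        else:
            merged.append(rec)              # pass other record kinds through
    return merged


records = [
    Record("sample", 801, ["do_syscall_64", "entry_SYSCALL_64"]),
    Record("deferred", 801, ["__GI___ioctl"]),
]
print(merge_callchains(records)[0].frames)
# ['do_syscall_64', 'entry_SYSCALL_64', '__GI___ioctl']
```

In this sketch a sample that never sees a matching deferred record keeps
only its kernel frames, and a deferred record with no pending sample is
dropped.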
