[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Z6uWUMLOG9uBDb3L@x1>
Date: Tue, 11 Feb 2025 19:26:24 +0100
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Howard Chu <howardchu95@...il.com>
Cc: Namhyung Kim <namhyung@...nel.org>, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
irogers@...gle.com, adrian.hunter@...el.com,
kan.liang@...ux.intel.com, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v14 00/10] perf record --off-cpu: Dump off-cpu samples
directly
On Tue, Feb 11, 2025 at 07:02:04PM +0100, Arnaldo Carvalho de Melo wrote:
> When running just one workload:
>
> root@...ber:~# for a in `seq 10` ; do perf record --off-cpu sleep 1s ; perf script | grep offcpu-time ; done
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.037 MB perf.data (13 samples) ]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.037 MB perf.data (16 samples) ]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.037 MB perf.data (14 samples) ]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.037 MB perf.data (12 samples) ]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.037 MB perf.data (7 samples) ]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.037 MB perf.data (7 samples) ]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.037 MB perf.data (7 samples) ]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.037 MB perf.data (7 samples) ]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.037 MB perf.data (7 samples) ]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.037 MB perf.data (7 samples) ]
> root@...ber:~#
>
> Have you tested it like that? Is this a known issue/current limitation?
>
> I'm reading the patches to see I understand this better.
Ok, if I go back and use the example Namhyung added to the commit
message introducing off-cpu profiling:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=edc41a1099c2d08ccfd4ed7d59688501e3749015
We get offcpu-time samples:
root@...ber:~# perf record --off-cpu perf bench sched messaging -l 1000
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 0.244 [sec]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.182 MB perf.data (43600 samples) ]
root@...ber:~# perf script | grep offcpu-time | wc -l
867
root@...ber:~#
But they are all at the end:
sched-messaging 469373 17707.960698: 1055439 cpu_atom/cycles/P: ffffffff8a49a453 mod_objcg_state+0xb3 ([kernel.kallsyms])
sched-messaging 469373 17707.960944: 1057379 cpu_atom/cycles/P: ffffffff8b222cf9 native_queued_spin_lock_slowpath+0x209 ([kernel.kallsyms])
sched-messaging 469274 17707.961299: 6311728 cpu_core/cycles/P: 7fa751a510f1 __cxa_finalize+0x151 (/usr/lib64/libc.so.6)
:469374 469374 [000] 18446744069.414584: 43388310 offcpu-time:
7fa751b1c1c4 [unknown] ([unknown])
4fb0e8 [unknown] ([unknown])
4faa29 [unknown] ([unknown])
495700 [unknown] ([unknown])
495a1b [unknown] ([unknown])
413473 [unknown] ([unknown])
7fa751a38088 [unknown] ([unknown])
7fa751a3814b [unknown] ([unknown])
413ac5 [unknown] ([unknown])
:469518 469518 [000] 18446744069.414584: 22436206 offcpu-time:
7fa751b1c1c4 [unknown] ([unknown])
4fb0e8 [unknown] ([unknown])
4faa29 [unknown] ([unknown])
495700 [unknown] ([unknown])
495a1b [unknown] ([unknown])
413473 [unknown] ([unknown])
7fa751a38088 [unknown] ([unknown])
7fa751a3814b [unknown] ([unknown])
413ac5 [unknown] ([unknown])
:469663 469663 [000] 18446744069.414584: 180198700 offcpu-time:
7fa751b1d304 [unknown] ([unknown])
4faa29 [unknown] ([unknown])
495700 [unknown] ([unknown])
495a1b [unknown] ([unknown])
413473 [unknown] ([unknown])
7fa751a38088 [unknown] ([unknown])
7fa751a3814b [unknown] ([unknown])
413ac5 [unknown] ([unknown])
possibly the threshold?
Tried:
root@...ber:~# perf record --off-cpu --off-cpu-thresh 10 perf bench sched messaging -l 1000
Didn't help.
When doing a system wide session with that 10 threshold I get the
offcpu-time samples intermixed, i.e. not just at the end:
root@...ber:~# perf record -a --off-cpu --off-cpu-thresh 10 perf bench sched messaging -l 1000
sched-messaging 472324 [007] 18587.393976: 1371787 cpu_core/cycles/P: ffffffff8a7864d2 avc_has_perm+0x92 ([kernel.kallsyms])
swapper 0 [023] 18587.394111: 286156 cpu_atom/cycles/P: ffffffff8a16aacb do_idle+0x14b ([kernel.kallsyms])
sched-messaging 472532 [016] 18587.394210: 294672 cpu_atom/cycles/P: ffffffff8b4000ab entry_SYSCALL_64_safe_stack+0x0 ([kernel.kallsyms])
sched-messaging 472324 [007] 18587.394226: 1371903 cpu_core/cycles/P: ffffffff8a3f6308 copy_page_range+0x12a8 ([kernel.kallsyms])
podman 3935 [012] 18587.394310: 10106482 offcpu-time:
5630f6bf2457 [unknown] (/usr/bin/podman)
5630f6bc49e5 [unknown] (/usr/bin/podman)
5630f6bbb1f5 [unknown] (/usr/bin/podman)
5630f6bbb136 [unknown] (/usr/bin/podman)
5630f6beeb65 [unknown] (/usr/bin/podman)
7f4474032088 start_thread+0x3b8 (/usr/lib64/libc.so.6)
7f44740b5f8c clone3+0x2c (/usr/lib64/libc.so.6)
sched-messaging 472534 [016] 18587.394434: 279374 cpu_atom/cycles/P: ffffffff8a3826c9 get_callchain_entry+0x9 ([kernel.kallsyms])
sched-messaging 472324 [007] 18587.394476: 1371988 cpu_core/cycles/P: ffffffff8a43d264 __kmalloc_node_noprof+0xd4 ([kernel.kallsyms])
swapper 0 [016] 18587.394717: 276534 cpu_atom/cycles/P: ffffffff8a1d2ef4 rcu_preempt_deferred_qs+0x4 ([kernel.kallsyms])
<SNIP>
sched-messaging 472674 [013] 18587.414755: 6576247 cpu_core/cycles/P: ffffffff8b222d7d native_queued_spin_lock_slowpath+0x28d ([kernel.kallsyms])
sched-messaging 472625 [022] 18587.414760: 10699110 offcpu-time:
7f6b1a71d304 __GI___libc_write+0x14 (/usr/lib64/libc.so.6)
4fb0e8 bench_sched_messaging+0x218 (/home/acme/bin/perf)
4faa29 cmd_bench+0x2f9 (/home/acme/bin/perf)
495700 run_builtin+0x70 (/home/acme/bin/perf)
495a1b handle_internal_command+0xab (/home/acme/bin/perf)
413473 main+0x2f3 (/home/acme/bin/perf)
7f6b1a638088 __libc_start_call_main+0x78 (/usr/lib64/libc.so.6)
7f6b1a63814b __libc_start_main@@GLIBC_2.34+0x8b (/usr/lib64/libc.so.6)
413ac5 _start+0x25 (/home/acme/bin/perf)
sched-messaging 472623 [023] 18587.414763: 10911814 offcpu-time:
7f6b1a71d304 __GI___libc_write+0x14 (/usr/lib64/libc.so.6)
4fb0e8 bench_sched_messaging+0x218 (/home/acme/bin/perf)
4faa29 cmd_bench+0x2f9 (/home/acme/bin/perf)
495700 run_builtin+0x70 (/home/acme/bin/perf)
495a1b handle_internal_command+0xab (/home/acme/bin/perf)
413473 main+0x2f3 (/home/acme/bin/perf)
7f6b1a638088 __libc_start_call_main+0x78 (/usr/lib64/libc.so.6)
7f6b1a63814b __libc_start_main@@GLIBC_2.34+0x8b (/usr/lib64/libc.so.6)
413ac5 _start+0x25 (/home/acme/bin/perf)
<SNIP>
So I'm guessing that running just the 'sleep' workload we are missing
the sample at some point while with the 'perf bench' we have way more
samples, etc.
Also the intermixed entries have backtraces resolved to symbols, while
the ones at the end don't have, something else to investigate.
- Arnaldo
Powered by blists - more mailing lists