Message-ID: <CAM9d7chzw4UeHHeXaMfPTiRdLbv7PbpK=xkgxMDojAxAc8y7Jg@mail.gmail.com>
Date: Thu, 12 Oct 2023 09:19:04 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Ingo Molnar <mingo@...nel.org>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
Jiri Olsa <jolsa@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
LKML <linux-kernel@...r.kernel.org>,
linux-perf-users@...r.kernel.org,
Linus Torvalds <torvalds@...ux-foundation.org>,
Stephane Eranian <eranian@...gle.com>,
Masami Hiramatsu <mhiramat@...nel.org>,
linux-toolchains@...r.kernel.org,
linux-trace-devel@...r.kernel.org,
Ben Woodard <woodard@...hat.com>,
Joe Mario <jmario@...hat.com>,
Kees Cook <keescook@...omium.org>,
David Blaikie <blaikie@...gle.com>,
Xu Liu <xliuprof@...gle.com>,
Kan Liang <kan.liang@...ux.intel.com>,
Ravi Bangoria <ravi.bangoria@....com>
Subject: Re: [RFC 00/48] perf tools: Introduce data type profiling (v1)
Hi Ingo,
On Wed, Oct 11, 2023 at 11:03 PM Ingo Molnar <mingo@...nel.org> wrote:
>
>
> * Namhyung Kim <namhyung@...nel.org> wrote:
>
> > * How to use it
> >
> > To get precise memory access samples, users can run the `perf mem record`
> > command, which uses whichever events their architecture supports. Intel
> > machines work best because they have dedicated memory access events, but
> > those events filter out low-latency loads (under 30 cycles by default;
> > use the --ldlat option to change the threshold).
> >
> > # To get memory access samples in kernel for 1 second (on Intel)
> > $ sudo perf mem record -a -K --ldlat=4 -- sleep 1
> >
> > # Similar for the AMD (but it requires 6.3+ kernel for BPF filters)
> > $ sudo perf mem record -a --filter 'mem_op == load, ip > 0x8000000000000000' -- sleep 1
>
> BTW., it would be nice for 'perf mem record' to just do the right thing on
> whatever machine it is running on.
>
> Also, why are BPF filters required - due to the IP filtering of mem-load
> events?
Right, because AMD uses IBS for precise events and it doesn't
have a filtering feature.
>
> Could we perhaps add an IP filter to perf events to get this built-in?
> Perhaps attr->exclude_user would achieve something similar?
Unfortunately, IIUC, IBS doesn't support privilege filters. We could
add general filtering logic to the NMI handler, but I'm afraid that
would complicate the code and might slow it down a bit. A simple
privilege filter by IP range alone is probably ok.
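
To make the idea concrete, here is a rough user-space sketch of a privilege
check by IP range. It is not kernel code and not what perf does today. It
assumes x86-64, where kernel addresses sit in the upper half of the address
space, and reuses the threshold from the --filter example above. The names
ip_is_kernel() and filter_kernel_samples() are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumption: x86-64 layout. This is the same threshold used in the
 * 'ip > 0x8000000000000000' filter expression, nothing more precise.
 */
#define KERNEL_IP_THRESHOLD 0x8000000000000000ULL

/* Hypothetical helper: treat an IP above the threshold as a kernel IP. */
static bool ip_is_kernel(uint64_t ip)
{
	return ip > KERNEL_IP_THRESHOLD;
}

/*
 * Hypothetical helper: compact the array in place so that only kernel
 * IPs remain, preserving their order. Returns the number kept.
 */
static size_t filter_kernel_samples(uint64_t *ips, size_t n)
{
	size_t kept = 0;

	for (size_t i = 0; i < n; i++) {
		if (ip_is_kernel(ips[i]))
			ips[kept++] = ips[i];
	}
	return kept;
}
```

The check itself is a single comparison per sample, so the cost concern is
more about where it would have to run (the NMI path) than about the test.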
>
> > In perf report, it's just a matter of selecting the new sort keys: 'type'
> > and 'typeoff'. 'type' shows the name of the data type as a whole, while
> > 'typeoff' shows the offset and name of the accessed field within that
> > type. I found it useful to combine them with the --hierarchy option to
> > group related entries at the same level.
> >
> > $ sudo perf report -s type,typeoff --hierarchy --stdio
> > ...
> > #
> > # Overhead Data Type / Data Type Offset
> > # ........... ............................
> > #
> > 23.95% (stack operation)
> > 23.95% (stack operation) +0 (no field)
> > 23.43% (unknown)
> > 23.43% (unknown) +0 (no field)
> > 10.30% struct pcpu_hot
> > 4.80% struct pcpu_hot +0 (current_task)
> > 3.53% struct pcpu_hot +8 (preempt_count)
> > 1.88% struct pcpu_hot +12 (cpu_number)
> > 0.07% struct pcpu_hot +24 (top_of_stack)
> > 0.01% struct pcpu_hot +40 (softirq_pending)
> > 4.25% struct task_struct
> > 1.48% struct task_struct +2036 (rcu_read_lock_nesting)
> > 0.53% struct task_struct +2040 (rcu_read_unlock_special.b.blocked)
> > 0.49% struct task_struct +2936 (cred)
> > 0.35% struct task_struct +3144 (audit_context)
> > 0.19% struct task_struct +46 (flags)
> > 0.17% struct task_struct +972 (policy)
> > 0.15% struct task_struct +32 (stack)
> > 0.15% struct task_struct +8 (thread_info.syscall_work)
> > 0.10% struct task_struct +976 (nr_cpus_allowed)
> > 0.09% struct task_struct +2272 (mm)
> > ...
>
> This looks really useful!
:)
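
For anyone wondering what a 'typeoff' entry such as "+8 (preempt_count)"
boils down to, below is a small sketch of mapping a byte offset to the member
that contains it. It only illustrates the idea: struct demo_hot is a
hypothetical stand-in (not the kernel's struct pcpu_hot), and the table is
built with offsetof() rather than from DWARF debug info, which is what the
perf side relies on. field_at() and the MEMBER() macro are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in type; fixed-width members keep offsets portable. */
struct demo_hot {
	uint64_t current_task;   /* offset 0 */
	int32_t  preempt_count;  /* offset 8 */
	int32_t  cpu_number;     /* offset 12 */
};

struct member_info {
	const char *name;
	size_t offset;
	size_t size;
};

/* Record a member's name, offset and size at compile time. */
#define MEMBER(type, field) \
	{ #field, offsetof(type, field), sizeof(((type *)0)->field) }

static const struct member_info demo_hot_members[] = {
	MEMBER(struct demo_hot, current_task),
	MEMBER(struct demo_hot, preempt_count),
	MEMBER(struct demo_hot, cpu_number),
};

/* Return the name of the member covering 'off', or "(no field)". */
static const char *field_at(const struct member_info *m, size_t n, size_t off)
{
	for (size_t i = 0; i < n; i++) {
		if (off >= m[i].offset && off < m[i].offset + m[i].size)
			return m[i].name;
	}
	return "(no field)";
}
```

With this table, field_at(demo_hot_members, 3, 8) yields "preempt_count",
and an offset past the last member falls back to "(no field)".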
Thanks,
Namhyung