Message-ID: <Ztea3dUZ-XSG2gfB@tassilo>
Date: Tue, 3 Sep 2024 16:25:17 -0700
From: Andi Kleen <ak@...ux.intel.com>
To: Ian Rogers <irogers@...gle.com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
linux-perf-users <linux-perf-users@...r.kernel.org>,
Namhyung Kim <namhyung@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v10 1/4] Create source symlink in perf object dir
On Mon, Aug 26, 2024 at 04:53:01PM -0700, Ian Rogers wrote:
> On Mon, Aug 26, 2024, 4:34 PM Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
> >
> > On Mon, Aug 26, 2024 at 08:27:43AM -0700, Ian Rogers wrote:
> > > On Mon, Aug 26, 2024 at 7:32 AM Arnaldo Carvalho de Melo
> > > <acme@...nel.org> wrote:
> > > >
> > > > On Sun, Aug 25, 2024 at 09:58:23AM -0700, Andi Kleen wrote:
> > > > > Arnaldo,
> > > >
> > > > > can you please apply the patchkit? This fixes a regression.
> > > >
> > > > First one was applied, was letting the others to be out there for a
> > > > while, I thought there were concerns about it, but I see Namhyung's Ack,
> > > > so applied.
> > >
> > > Can we not apply this? See comments on the thread. Basically we're
> >
> > And what about the reported segfault?
>
> It is better addressed by:
> https://lore.kernel.org/lkml/20240720074552.1915993-1-irogers@google.com/
I finally got around to testing this other patch.
The reason for the feature is to get the metric for every individual
sampling interval as the most fine-grained unit, as explained in the
original commit message:

    perf script: Allow computing 'perf stat' style metrics

    Add support for computing 'perf stat' style metrics in 'perf script'.
    When using leader sampling we can get metrics ____for each sampling period___
    by computing formulas over the values of the different group members.
    This allows things like fine grained IPC tracking through sampling, much
    more fine grained than with 'perf stat'.
    The metric is still averaged over the sampling period, it is not just
    for the sampling point.
    ...

Note the "for each sampling period", which is the key aspect.
With my version I get:
perf record -e '{cycles,instructions}:S' -a tcall
perf script -F +metric
perf 2061404 [000] 6395040.804752: 2687 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [000] 6395040.804752: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [000] 6395040.804752: metric: 0.15 insn per cycle
perf 2061404 [001] 6395040.804879: 2411 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [001] 6395040.804879: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [001] 6395040.804879: metric: 0.16 insn per cycle
perf 2061404 [002] 6395040.805000: 2245 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [002] 6395040.805000: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [002] 6395040.805000: metric: 0.18 insn per cycle
perf 2061404 [003] 6395040.805122: 2442 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [003] 6395040.805122: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [003] 6395040.805122: metric: 0.16 insn per cycle
perf 2061404 [004] 6395040.805241: 2208 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [004] 6395040.805241: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [004] 6395040.805241: metric: 0.18 insn per cycle
perf 2061404 [005] 6395040.805359: 2199 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [005] 6395040.805359: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [005] 6395040.805359: metric: 0.18 insn per cycle
perf 2061404 [006] 6395040.805479: 2269 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [006] 6395040.805479: 382 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [006] 6395040.805479: metric: 0.17 insn per cycle
perf 2061404 [007] 6395040.805596: 2215 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [007] 6395040.805596: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [007] 6395040.805596: metric: 0.18 insn per cycle
perf 2061404 [008] 6395040.805715: 2258 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [008] 6395040.805715: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [008] 6395040.805715: metric: 0.18 insn per cycle
perf 2061404 [009] 6395040.805835: 2293 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [009] 6395040.805835: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
As you can see, there is one metric for every sampling period.
But Ian's version generates this:
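
To spell out what "one metric per sampling period" means in the output above: each metric line is the instructions value divided by the cycles value of the same group sample. A minimal sketch in Python, with the numbers copied from the first three samples above. The function name and the tuple layout are made up for illustration and are not anything in perf itself:

```python
# Sketch only: recompute the per-period "insn per cycle" metric from the
# group sample values shown above. Nothing here is perf's actual code.

# (cpu, timestamp, cycles, instructions), copied from the first three samples
samples = [
    (0, 6395040.804752, 2687, 396),
    (1, 6395040.804879, 2411, 396),
    (2, 6395040.805000, 2245, 396),
]

def ipc_per_period(samples):
    """Return one insn-per-cycle value for every sampling period."""
    return [round(insns / cycles, 2) for (_cpu, _ts, cycles, insns) in samples]

print(ipc_per_period(samples))  # [0.15, 0.16, 0.18]
```

These match the 0.15, 0.16 and 0.18 metric lines printed for CPUs 000 to 002.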
perf 2061404 [000] 6395040.804752: 2687 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [000] 6395040.804752: metric: 0.15 insn per cycle
perf 2061404 [000] 6395040.804752: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [000] 6395040.804752: metric: 0.07 insn per cycle
This is the only metric for "perf"
perf 2061404 [001] 6395040.804879: 2411 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [001] 6395040.804879: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [002] 6395040.805000: 2245 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [002] 6395040.805000: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [003] 6395040.805122: 2442 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [003] 6395040.805122: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [004] 6395040.805241: 2208 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [004] 6395040.805241: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [005] 6395040.805359: 2199 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [005] 6395040.805359: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [006] 6395040.805479: 2269 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [006] 6395040.805479: 382 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [007] 6395040.805596: 2215 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [007] 6395040.805596: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [008] 6395040.805715: 2258 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [008] 6395040.805715: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [009] 6395040.805835: 2293 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [009] 6395040.805835: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [010] 6395040.806013: 2159 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [010] 6395040.806013: 396 instructions: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
perf 2061404 [011] 6395040.806121: 3058 cycles: ffffffff990a579a native_write_msr+0xa ([kernel.kallsyms])
.... <lots more samples, but no more metrics for "perf">
There are some metrics for other processes, but I don't even know what logic it follows here
(as in, which intervals actually get aggregated).
So yes, his implementation may be cleaner, but it simply doesn't solve the problem;
it implements something else.
-Andi