[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aDeH_QfWHPa9KT1S@google.com>
Date: Wed, 28 May 2025 15:02:37 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>,
James Clark <james.clark@...aro.org>,
"Masami Hiramatsu (Google)" <mhiramat@...nel.org>,
Ravi Bangoria <ravi.bangoria@....com>, Leo Yan <leo.yan@....com>,
Yujie Liu <yujie.liu@...el.com>,
Graham Woodward <graham.woodward@....com>,
Howard Chu <howardchu95@...il.com>,
Weilin Wang <weilin.wang@...el.com>,
Dmitry Vyukov <dvyukov@...gle.com>, Andi Kleen <ak@...ux.intel.com>,
Thomas Falcon <thomas.falcon@...el.com>,
Matt Fleming <matt@...dmodwrite.com>,
Chun-Tse Shao <ctshao@...gle.com>, Ben Gainey <ben.gainey@....com>,
Song Liu <song@...nel.org>, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org, Kajol Jain <kjain@...ux.ibm.com>,
Athira Rajeev <atrajeev@...ux.vnet.ibm.com>
Subject: Re: [PATCH v3 1/3] perf sample: Remove arch notion of sample parsing
On Wed, May 28, 2025 at 01:38:03PM -0700, Ian Rogers wrote:
> On Wed, May 28, 2025 at 1:09 PM Namhyung Kim <namhyung@...nel.org> wrote:
> >
> > On Wed, May 28, 2025 at 11:27:06AM -0700, Ian Rogers wrote:
> > > On Wed, May 28, 2025 at 11:11 AM Namhyung Kim <namhyung@...nel.org> wrote:
> > > >
> > > > On Wed, May 21, 2025 at 02:15:24PM -0700, Ian Rogers wrote:
> > > > > On Wed, May 21, 2025 at 1:27 PM Namhyung Kim <namhyung@...nel.org> wrote:
> > > > > >
> > > > > > On Wed, May 21, 2025 at 09:53:15AM -0700, Ian Rogers wrote:
> > > > > > > By definition arch sample parsing and synthesis will inhibit certain
> > > > > > > kinds of cross-platform record then analysis (report, script,
> > > > > > > etc.). Remove arch_perf_parse_sample_weight and
> > > > > > > arch_perf_synthesize_sample_weight replacing with a common
> > > > > > > implementation. Combine perf_sample p_stage_cyc and retire_lat to
> > > > > > > capture the differing uses regardless of compiled for architecture.
> > > > > >
> > > > > > Can you please do this without renaming? It can be a separate patch but
> > > > > > I think we can just leave it.
> > > > >
> > > > > It is not clear what the use of the union is. Presumably it is a
> > > > > tagged union but there is no tag as the union value to use is implied
> > > > > by either being built on x86_64 or powerpc. The change removes the
> > > > > notion of this code being built for x86_64 or powerpc and so the union
> > > > > value to use isn't clear (e.g. should arm use p_stage_cyc or
> > > > > retire_lat from the union), hence combining to show that it could be
> > > > > one or the other. The code could be:
> > > > > ```
> > > > > #ifdef __x86_64__
> > > > > u16 p_stage_cyc;
> > > > > #elif defined(powerpc)
> > > > > u16 retire_lat;
> > > > > #endif
> > > > > ```
> > > > > but this isn't cross-platform.
> > > >
> > > > Right, we probably don't want it.
> > > >
> > > >
> > > > > The change in hist.h of
> > > > > ```
> > > > > @@ -255,7 +255,7 @@ struct hist_entry {
> > > > > u64 code_page_size;
> > > > > u64 weight;
> > > > > u64 ins_lat;
> > > > > - u64 p_stage_cyc;
> > > > > + u64 p_stage_cyc_or_retire_lat;
> > > > > ```
> > > > > could be a follow up CL, but then we lose something of what the field
> > > > > is holding given the value is just a copy of that same named value in
> > > > > perf_sample. The code only inserts 34 lines and so the churn of doing
> > > > > that seemed worse than having the change in a single patch for
> > > > > clarity.
> > > >
> > > > Assuming other archs can add something later, we won't rename the field
> > > > again. So I can live with the ugly union fields. If we really want to
> > > > rename it, I prefer calling it just 'weight3' and let the archs handle
> > > > the display name only.
> > >
> > > But that's my point (or in other words maybe you've missed my point) .
> > > Regardless of arch we should display p_stage_cyc if processing a
> > > perf.data file from a PowerPC as determined from the perf_env in the
> > > perf.data file, or retire_lat if processing a perf.data file from x86.
> >
> > Agreed.
> >
> >
> > > The arch of the perf build is entirely irrelevant and calling the
> > > variable an opaque weight3 will require something that will need to be
> > > disambiguate it elsewhere in the code. The goal in variable names
> > > should be to be intention revealing, which I think
> > > p_stage_cyc_or_retire_lat is doing better than weight3, which is
> > > something of a regression from the existing but inaccurate
> > > p_stage_cyc.
> >
> > Yeah, but I worried if it would end up with
> > 'p_stage_cyc_or_retire_lat_or_something_else' later.
>
> Perhaps it should be:
> ```
> union {
> u16 raw;
> u16 p_stage_cyc;
> u16 retire_lat;
> } weight3;
> ```
> to try to best capture this. `xyz.weight3.raw` when the PowerPC or x86
> use isn't known, etc.
Looks better. But based on the offline discussion, it'd be better to
just use 'u16 weight3'.
> In the histogram code the u16 is a u64.
Yep, because histogram entry aggregates sample weights.
Thanks,
Namhyung
Powered by blists - more mailing lists