linux-kernel - Re: [PATCH v2 3/3] perf sort: Use perf_env to set arch sort keys and header

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAP-5=fWXgEMf7ZesTg6+JJ_nV1s1ZM4xPm2R1umXSYkM208ZmQ@mail.gmail.com>
Date: Fri, 23 May 2025 08:51:55 -0700
From: Ian Rogers <irogers@...gle.com>
To: "Liang, Kan" <kan.liang@...ux.intel.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, James Clark <james.clark@...aro.org>, 
	"Masami Hiramatsu (Google)" <mhiramat@...nel.org>, Ravi Bangoria <ravi.bangoria@....com>, Leo Yan <leo.yan@....com>, 
	Yujie Liu <yujie.liu@...el.com>, Graham Woodward <graham.woodward@....com>, 
	Howard Chu <howardchu95@...il.com>, Weilin Wang <weilin.wang@...el.com>, 
	Dmitry Vyukov <dvyukov@...gle.com>, Andi Kleen <ak@...ux.intel.com>, 
	Thomas Falcon <thomas.falcon@...el.com>, Matt Fleming <matt@...dmodwrite.com>, 
	Chun-Tse Shao <ctshao@...gle.com>, Ben Gainey <ben.gainey@....com>, Song Liu <song@...nel.org>, 
	linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v2 3/3] perf sort: Use perf_env to set arch sort keys and header

On Fri, May 23, 2025 at 7:51 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
>
> On 2025-05-21 3:19 p.m., Ian Rogers wrote:
> > On Wed, May 21, 2025 at 11:14 AM Liang, Kan <kan.liang@...ux.intel.com> wrote:
> >>
> >> On 2025-05-21 12:16 p.m., Ian Rogers wrote:
> >>>>> diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
> >>>>> index 7b6cde87d2af..13ef0d188a96 100644
> >>>>> --- a/tools/perf/builtin-top.c
> >>>>> +++ b/tools/perf/builtin-top.c
> >>>>> @@ -1747,7 +1747,14 @@ int cmd_top(int argc, const char **argv)
> >>>>>
> >>>>>       setup_browser(false);
> >>>>>
> >>>>> -     if (setup_sorting(top.evlist) < 0) {
> >>>>> +     top.session = perf_session__new(/*data=*/NULL, /*tool=*/NULL);
> >>>>> +     if (IS_ERR(top.session)) {
> >>>>> +             status = PTR_ERR(top.session);
> >>>>> +             top.session = NULL;
> >>>>> +             goto out_delete_evlist;
> >>>>> +     }
> >>>>> +
> >>>>> +     if (setup_sorting(top.evlist, &top.session->header.env) < 0) {
> >>>> I doubt a valide env can be got in perf_session__new(), since there is
> >>>> no perf.data in perf top.
> >>>> Maybe just need to invoke the perf_env__raw_arch() instead to fill the
> >>>> env->arch.
> >>> I think the current code is making things harder than it should be, we
> >>> should work away from perf_env__arch and strings, instead using EM_
> >>> values which we can default to EM_HOST avoiding any runtime costs.
> >>> Looking at perf_env__arch:
> >>> ```
> >>> const char *perf_env__arch(struct perf_env *env)
> >>> {
> >>>         char *arch_name;
> >>>
> >>>         if (!env || !env->arch) { /* Assume local operation */
> >>>                 static struct utsname uts = { .machine[0] = '\0', };
> >>>                 if (uts.machine[0] == '\0' && uname(&uts) < 0)
> >>>                         return NULL;
> >>>                 arch_name = uts.machine;
> >>>         } else
> >>>                 arch_name = env->arch;
> >>>
> >>>         return normalize_arch(arch_name);
> >>> }
> >>> ```
> >>> in this case env->arch == NULL and so the uname machine will be used.
> >>> For perf_env__raw_arch the behavior is similar but it populates the
> >>> env:
> >>> ```
> >>> static int perf_env__read_arch(struct perf_env *env)
> >>> {
> >>>         struct utsname uts;
> >>>
> >>>         if (env->arch)
> >>>                 return 0;
> >>>
> >>>         if (!uname(&uts))
> >>>                 env->arch = strdup(uts.machine);
> >>>
> >>>         return env->arch ? 0 : -ENOMEM;
> >>> }
> >>>
> >>> const char *perf_env__raw_arch(struct perf_env *env)
> >>> {
> >>>         return env && !perf_env__read_arch(env) ? env->arch : "unknown";
> >>> }
> >>> ```
> >>> Aside from caching the arch, the main difference is that
> >>> normalize_arch isn't called. Not having normalize_arch means the code
> >>> in arch_support_sort_key and arch_perf_header_entry would need to
> >>> handle strings "ppc" as well as "powerpc", "i386" as well as "x86",
> >>> etc. As I'd prefer not handle all those cases I think the way the code
> >>> is is best given how the env code is currently structured.
> >>
> >> Right. The perf_env__raw_arch() doesn't improve anything.
> >> But I still don't like &top.session->header.env.
> >> Because I don't think you can get any useful information from
> >> top.session->header.env. It just brings confusions. (It seems an env is
> >> retrieved, but it is actually not.)
> >
> > Well there's a certain consistency in using the session env to set up
> > the sorting, etc. This pre-exists this change with nearly every
> > builtin-* file doing `symbol__init(&session->header.env);`. perf top
> > does `symbol__init(NULL);`:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/builtin-top.c?h=perf-tools-next#n1811
> > but the code now has lazy initialization patterns and handling NULL as
> > a special case of meaning host machine:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/symbol.c?h=perf-tools-next#n2350
> >
> >> In the perf top, &perf_env is used for the existing cases. If any env
> >> fields are not available, perf_env__read_XXX() is invoked to get the
> >> information.
> >> I think we may follow the existing usage, e.g.,
> >> setup_sorting(top.evlist, &perf_env).
> >
> > So using the global perf_env rather than NULL feels preferable but I
> > think the global perf_env should be deleted. Whenever I see the global
> > perf_env in use I think the code has a bug as the perf_env should be
> > coming from the session or the machine. The global perf_env can have
> > no meaning for cases like `perf diff` where more than one
> > file/header/env is open at a time. The global perf_env variable's
> > existence encourages bad or broken code, so deleting it should avoid
> > errors in code. Another place these issues can occur is with TPEBS
> > where we're maintaining multiple sessions for sampling alongside
> > counting.
> >
> >> Alternatively, it looks like the perf top doesn't support --weight. The
> >> env->arch should never be used. If so, a NULL can be used as well, E.g.,
> >> setup_sorting(top.evlist, NULL).
> >
> > So I don't like NULL because then we have lazy initialization of host
> > data and NULL meaning use host. I don't like the global perf_env
> > variable as it has a code smell about it. I think moving the session
> > initialization earlier in perf top so its env, although unpopulated,
> > can be used is consistent with `perf report` - this is consistent with
> > `perf top` being `perf record` glued together with `perf report`. So I
> > think the change here is the smallest and most sensible.
> >
> > Longer term let's delete the global perf_env variable,  perf_env__arch
> > should be switched to a perf_env__e_machine as then we can avoid uname
> > calls just to determine the machine architecture, etc.
> >
>
> I'm fine with the session's env, as long as there is a consistent env
> source in the perf top. Because in the recent perf top fixes, we
> randomly pick the env source. Thomas's patch used the global env, but
> this one chose the session's env. It brings confusions.
> https://lore.kernel.org/lkml/20250513231813.13846-2-thomas.falcon@intel.com/
>
> Could you please send a clean up patch to address the inconsistency?

Will do, thanks Kan!

Ian