Message-ID: <CAP-5=fXfyd9b7Ns-SL5F+iffc7oy4NFHBsT3oj3CRMbBa1QCfg@mail.gmail.com>
Date: Thu, 24 Oct 2024 00:07:46 -0700
From: Ian Rogers <irogers@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>
Cc: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Mark Rutland <mark.rutland@....com>, 
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, 
	Adrian Hunter <adrian.hunter@...el.com>, Kan Liang <kan.liang@...ux.intel.com>, 
	Ravi Bangoria <ravi.bangoria@....com>, Weilin Wang <weilin.wang@...el.com>, 
	Yoshihiro Furudera <fj5100bi@...itsu.com>, James Clark <james.clark@...aro.org>, 
	Athira Jajeev <atrajeev@...ux.vnet.ibm.com>, Howard Chu <howardchu95@...il.com>, 
	Oliver Upton <oliver.upton@...ux.dev>, Changbin Du <changbin.du@...wei.com>, 
	Ze Gao <zegao2021@...il.com>, Junhao He <hejunhao3@...wei.com>, linux-kernel@...r.kernel.org, 
	linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v6 0/5] Hwmon PMUs

On Wed, Oct 23, 2024 at 8:06 PM Namhyung Kim <namhyung@...nel.org> wrote:
>
> Hi Ian,
>
> On Tue, Oct 22, 2024 at 11:06:18AM -0700, Ian Rogers wrote:
> > Following the convention of the tool PMU, create a hwmon PMU that
> > exposes hwmon data for reading. For example, the following shows
> > reading the CPU temperature and 2 fan speeds alongside the uncore
> > frequency:
> > ```
> > $ perf stat -e temp_cpu,fan1,hwmon_thinkpad/fan2/,tool/num_cpus_online/ -M UNCORE_FREQ -I 1000
> >      1.001153138              52.00 'C   temp_cpu
> >      1.001153138              2,588 rpm  fan1
> >      1.001153138              2,482 rpm  hwmon_thinkpad/fan2/
> >      1.001153138                  8      tool/num_cpus_online/
> >      1.001153138      1,077,101,397      UNC_CLOCK.SOCKET                 #     1.08 UNCORE_FREQ
> >      1.001153138      1,012,773,595      duration_time
> > ...
> > ```
> >
> > Additional data on the hwmon events is in perf list:
> > ```
> > $ perf list
> > ...
> > hwmon:
> > ...
> >   temp_core_0 OR temp2
> >        [Temperature in unit coretemp named Core 0. crit=100'C,max=100'C crit_alarm=0'C. Unit:
> >         hwmon_coretemp]
> > ...
> > ```
> >
> > v6: Add string.h #include for issue reported by kernel test robot.
> > v5: Fix asan issue in parse_hwmon_filename caught by a TMA metric.
> > v4: Drop merged patches 1 to 10. Separate adding the hwmon_pmu from
> >     the update to perf_pmu to use it. Try to make source of literal
> >     strings clearer via named #defines. Fix a number of GCC warnings.
> > v3: Rebase, add Namhyung's acked-by to patches 1 to 10.
> > v2: Address Namhyung's review feedback. Rebase dropping 4 patches
> >     applied by Arnaldo, fix build breakage reported by Arnaldo.
> >
> > Ian Rogers (5):
> >   tools api io: Ensure line_len_out is always initialized
> >   perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs
> >   perf pmu: Add calls enabling the hwmon_pmu
> >   perf test: Add hwmon "PMU" test
> >   perf docs: Document tool and hwmon events
>
> I think patch 2 can easily be split into core and other parts
> like dealing with aliases and units.  I believe it'd be helpful for
> others (like me) to understand how it works.
>
> Please take a look at 'perf/hwmon-pmu' branch in:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks Namhyung, but I don't see this making anything simpler, and I
can see significant new bugs. Your new patch:
https://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git/commit/?h=perf/hwmon-pmu&id=85c78b5bf71fb3e67ae815f7b2d044648fa08391
has taken about 40% out of patch 2, but has done so by splitting
function declarations from their definitions, enum declarations from
any use, etc. It also adds code like:

snprintf(buf, sizeof(buf), "%s_input", evsel->name);

but this would be a strange thing to do. The evsel->name is rewritten
by fallback logic, so cycles may become cycles:u if kernel profiling
is restricted. This is why we have a metric-id in the evsel: we cannot
rely on the evsel->name staying unchanged when looking up events for
metrics. Using the name as part of a sysfs filename lookup doesn't
make sense to me, as the evsel fallback logic can now break a hwmon
event. In the original patch the code was:

snprintf(buf, sizeof(buf), "%s%d_input", hwmon_type_strs[key.type], key.num);

where those two values are constant: key.type and key.num are both
embedded in the config value, which the evsel fallback logic won't
change. But bringing in the code that does that basically brings in
all of the rest of patch 2.
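
To make the difference concrete, here is a minimal standalone sketch,
not the code from either patch. The sketch_* names, the two-entry type
table and the bit layout of the config value are all assumptions made
for illustration; only the two snprintf format strings mirror the ones
quoted above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of sensor types; not the list from the patch. */
enum sketch_hwmon_type { SKETCH_TEMP, SKETCH_FAN, SKETCH_TYPE_MAX };

static const char *const sketch_type_strs[SKETCH_TYPE_MAX] = { "temp", "fan" };

/*
 * Pack the type and number into one config word. The layout (type above
 * bit 16, number in the low 16 bits) is an assumption of this sketch.
 */
static unsigned long sketch_encode(enum sketch_hwmon_type type, unsigned int num)
{
	assert(type < SKETCH_TYPE_MAX);
	return ((unsigned long)type << 16) | (num & 0xffff);
}

/* Build "<type><num>_input" from the config value alone. */
static int sketch_filename_from_config(unsigned long config, char *buf, size_t len)
{
	unsigned int type = (config >> 16) & 0xff;
	unsigned int num = config & 0xffff;

	if (type >= SKETCH_TYPE_MAX)
		return -1;
	return snprintf(buf, len, "%s%u_input", sketch_type_strs[type], num);
}

/* Build "<name>_input" from a display name that something may have rewritten. */
static int sketch_filename_from_name(const char *name, char *buf, size_t len)
{
	return snprintf(buf, len, "%s_input", name);
}
```

With a config built by sketch_encode(SKETCH_FAN, 2), the first helper
yields the same filename whatever happens to the display name. The
second helper turns a name that has gained a modifier suffix (say a
hypothetical "fan2:u") into a filename that does not exist in sysfs.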

So the patch adds a PMU that looks broken. Rather than simplifying
things it creates a broken intermediate state; should that then be
fixed for the benefit of bisects?
It also complicates understanding: the declarations of functions and
enums have kernel-doc, but now the declarations and definitions are
split apart. To understand the code, I'd want to squash the patches
back together so I could see each declaration with its definition.

Thanks,
Ian
