lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAP-5=fV55Tn0j0+ceKZeWheCuMw4RHHvVp0qOu3csZT=ojP_nQ@mail.gmail.com>
Date: Sat, 1 Feb 2025 22:48:17 -0800
From: Ian Rogers <irogers@...gle.com>
To: David Laight <david.laight.linux@...il.com>
Cc: Thomas Richter <tmricht@...ux.ibm.com>, LKML <linux-kernel@...r.kernel.org>, 
	linux-s390@...r.kernel.org, 
	linux-perf-users <linux-perf-users@...r.kernel.org>, 
	Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>, 
	Alexander Gordeev <agordeev@...ux.ibm.com>, Vasily Gorbik <gor@...ux.ibm.com>, 
	sumanth Korikkar <sumanthk@...ux.ibm.com>, Heiko Carstens <hca@...ux.ibm.com>
Subject: Re: [PATCH] perf test: Fix perf test 11 hwmon endianess issue

On Sat, Feb 1, 2025, 9:14 AM David Laight <david.laight.linux@...il.com> wrote:
>
> On Sat, 1 Feb 2025 08:12:38 -0800
> Ian Rogers <irogers@...gle.com> wrote:
>
> > On Sat, Feb 1, 2025 at 3:34 AM David Laight
> > <david.laight.linux@...il.com> wrote:
> > >
> > > On Fri, 31 Jan 2025 12:24:00 +0100
> > > Thomas Richter <tmricht@...ux.ibm.com> wrote:
> > >
> > > > perf test 11 hwmon fails on s390 with this error
> > > >
> > > >  # ./perf test -Fv 11
> > > >  --- start ---
> > > >  ---- end ----
> > > >  11.1: Basic parsing test             : Ok
> > > >  --- start ---
> > > >  Testing 'temp_test_hwmon_event1'
> > > >  Using CPUID IBM,3931,704,A01,3.7,002f
> > > >  temp_test_hwmon_event1 -> hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/
> > > >  FAILED tests/hwmon_pmu.c:189 Unexpected config for
> > > >     'temp_test_hwmon_event1', 292470092988416 != 655361
> > > >  ---- end ----
> > > >  11.2: Parsing without PMU name       : FAILED!
> > > >  --- start ---
> > > >  Testing 'hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/'
> > > >  FAILED tests/hwmon_pmu.c:189 Unexpected config for
> > > >     'hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/',
> > > >     292470092988416 != 655361
> > > >  ---- end ----
> > > >  11.3: Parsing with PMU name          : FAILED!
> > > >  #
> > > >
> > > > The root cause is in member test_event::config which is initialized
> > > > to 0xA0001 or 655361. During event parsing a long list event parsing
> > > > functions are called and end up with this gdb call stack:
> > > ...
> > > > However member key::type_and_num is defined as union and bit field:
> > > >
> > > >    union hwmon_pmu_event_key {
> > > >         long type_and_num;
> > > >         struct {
> > > >                 int num :16;
> > > >                 enum hwmon_type type :8;
> > > >         };
> > > >    };
> > >
> > > That is entirely horrid.
> >
> > It also has the side-effect that if you initialize the struct bits in
> > the type_and_num will be undefined and sanitizers will try to trip you
> > up.
> >
> > > I'm surprised this even compiles:
> > > static size_t hwmon_pmu__event_hashmap_hash(long key, void *ctx __maybe_unused)
> > > {
> > >         return ((union hwmon_pmu_event_key)key).type_and_num;
> > > }
> > > It has to be just 'return key', but I'm not sure what the hashmap code is doing.
> > >
> > > AFAICT the code is just trying to generate a value for the hashmap to hash on?
> > > Why not just use (type << 16 | num) instead of 'trying to be clever' with a union?
> >
> > The purpose wasn't so much to be 'clever' as you say. Perf event
> > configs, which is where type_and_num lives in the events, have their
> > bitfields described either by convention or by files in
> > /sys/devices/<pmu>/format. On an Intel laptop:
> > ```
> > $ cat /sys/devices/cpu/format/inv
> > config:23
> > ```
> > So I can go:
> > ```
> > $ perf stat -e 'inst_retired.any/inv=1/' true
> >
> > Performance counter stats for 'true':
> >
> >         1,069,756      inst_retired.any/inv=1/
> >
> >       0.001539265 seconds time elapsed
> >
> >       0.001719000 seconds user
> >       0.000000000 seconds sys
> > ```
> > and the configuration of my inst_retired.any event now has bit 23 in
> > the config set. Just as the format files try to break down what the
> > bitfields within a config u64 are doing, the union was trying to do
> > similar. Making an attempt to have the value be intention revealing
> > which the shift and masking may lose, for example, by making it look
> > like bits in the num could overlap with and modify the type.
>
> Except that is really doesn't work for big-endian at all.
> The 'fix' doesn't even really paper over the cracks properly.
>
> I'm not even sure the hashing works on 64bit BE.
> (assuming the int:n are also BE - but that is a different ABI option.)
> The 'num' and 'type' will be in the high 24 bits of the 64bit 'long'.
> If we assume the rest of the union is actually initialised the other
> bits are all zero.
> And I guess that the hash function is masking with 2^n-1 - so always gets zero.

No, as is typical in any hash table you always do a secondary hash:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/hashmap.h#n15

Thanks,
Ian

>         David
>
> >
> > Thanks,
> > Ian
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ