Message-ID: <CAP-5=fW_2iWEyOKao8MpMZWu7AQNX6-UKN1nEhr=mMxk0fUJKg@mail.gmail.com>
Date:   Wed, 6 Dec 2023 10:50:39 -0800
From:   Ian Rogers <irogers@...gle.com>
To:     Arnaldo Carvalho de Melo <acme@...nel.org>
Cc:     Ayush Jain <ayush.jain3@....com>,
        Sandipan Das <sandipan.das@....com>,
        linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
        peterz@...radead.org, Ingo Molnar <mingo@...nel.org>,
        mark.rutland@....com, alexander.shishkin@...ux.intel.com,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Adrian Hunter <adrian.hunter@...el.com>, kjain@...ux.ibm.com,
        atrajeev@...ux.vnet.ibm.com, barnali@...ux.ibm.com,
        ananth.narayan@....com, ravi.bangoria@....com,
        santosh.shukla@....com
Subject: Re: [PATCH] perf test: Retry without grouping for all metrics test

On Wed, Dec 6, 2023 at 9:54 AM Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
>
> On Wed, Dec 06, 2023 at 08:35:23AM -0800, Ian Rogers wrote:
> > On Wed, Dec 6, 2023 at 5:08 AM Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
> > > Hmm, I'm not able to reproduce the problem here, before applying
> > > this patch:
>
> > Please don't apply the patch. The patch masks a bug in metrics/PMUs
>
> I didn't
>
> > and the proper fix was:
> > 8d40f74ebf21 perf vendor events amd: Fix large metrics
> > https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@amd.com
>
> that is upstream:
>
> ⬢[acme@...lbox perf-tools-next]$ git log tools/perf/pmu-events/arch/x86/amdzen1/recommended.json
> commit 8d40f74ebf217d3b9e9b7481721e6236b857cc55
> Author: Sandipan Das <sandipan.das@....com>
> Date:   Thu Jul 6 12:04:40 2023 +0530
>
>     perf vendor events amd: Fix large metrics
>
>     There are cases where a metric requires more events than the number of
>     available counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four
>     data fabric counters but the "nps1_die_to_dram" metric has eight events.
>
>     By default, the constituent events are placed in a group and since the
>     events cannot be scheduled at the same time, the metric is not computed.
>     The "all metrics" test also fails because of this.
>
>     Use the NO_GROUP_EVENTS constraint for such metrics which anyway expect
>     the user to run perf with "--metric-no-group".
>
>     E.g.
>
>       $ sudo perf test -v 101
>
>     Before:
>
>       101: perf all metrics test                                           :
>       --- start ---
>       test child forked, pid 37131
>       Testing branch_misprediction_ratio
>       Testing all_remote_links_outbound
>       Testing nps1_die_to_dram
>       Metric 'nps1_die_to_dram' not printed in:
>       Error:
>       Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
>       Testing macro_ops_dispatched
>       Testing all_l2_cache_accesses
>       Testing all_l2_cache_hits
>       Testing all_l2_cache_misses
>       Testing ic_fetch_miss_ratio
>       Testing l2_cache_accesses_from_l2_hwpf
>       Testing l2_cache_misses_from_l2_hwpf
>       Testing op_cache_fetch_miss_ratio
>       Testing l3_read_miss_latency
>       Testing l1_itlb_misses
>       test child finished with -1
>       ---- end ----
>       perf all metrics test: FAILED!
>
>     After:
>
>       101: perf all metrics test                                           :
>       --- start ---
>       test child forked, pid 43766
>       Testing branch_misprediction_ratio
>       Testing all_remote_links_outbound
>       Testing nps1_die_to_dram
>       Testing macro_ops_dispatched
>       Testing all_l2_cache_accesses
>       Testing all_l2_cache_hits
>       Testing all_l2_cache_misses
>       Testing ic_fetch_miss_ratio
>       Testing l2_cache_accesses_from_l2_hwpf
>       Testing l2_cache_misses_from_l2_hwpf
>       Testing op_cache_fetch_miss_ratio
>       Testing l3_read_miss_latency
>       Testing l1_itlb_misses
>       test child finished with 0
>       ---- end ----
>       perf all metrics test: Ok
>
>     Reported-by: Ayush Jain <ayush.jain3@....com>
>     Suggested-by: Ian Rogers <irogers@...gle.com>
>     Signed-off-by: Sandipan Das <sandipan.das@....com>
>     Acked-by: Ian Rogers <irogers@...gle.com>
>     Cc: Adrian Hunter <adrian.hunter@...el.com>
>     Cc: Alexander Shishkin <alexander.shishkin@...ux.intel.com>
>     Cc: Ananth Narayan <ananth.narayan@....com>
>     Cc: Ingo Molnar <mingo@...hat.com>
>     Cc: Jiri Olsa <jolsa@...nel.org>
>     Cc: Mark Rutland <mark.rutland@....com>
>     Cc: Namhyung Kim <namhyung@...nel.org>
>     Cc: Peter Zijlstra <peterz@...radead.org>
>     Cc: Ravi Bangoria <ravi.bangoria@....com>
>     Cc: Santosh Shukla <santosh.shukla@....com>
>     Link: https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@amd.com
>     Signed-off-by: Arnaldo Carvalho de Melo <acme@...hat.com>
>
> > > Ian, I also stumbled on this:
>
> > > [root@...e ~]# perf stat -M dram_channel_data_controller_4
> > > Cannot find metric or group `dram_channel_data_controller_4'
> > > ^C
> > >  Performance counter stats for 'system wide':
>
> > >         284,908.91 msec cpu-clock                        #   32.002 CPUs utilized
> > >          6,485,456      context-switches                 #   22.763 K/sec
> > >                719      cpu-migrations                   #    2.524 /sec
> > >             32,800      page-faults                      #  115.125 /sec
>
> <SNIP>
>
> > > I.e. -M should bail out at that point (Cannot find metric or group `dram_channel_data_controller_4'), no?
>
> > We could. I suspect the code has always just not bailed out. I'll put
> > together a patch adding the bail out.
>
> Great, thanks,

Sent:
https://lore.kernel.org/lkml/20231206183533.972028-1-irogers@google.com/

Thanks,
Ian

> - Arnaldo
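
For illustration only, the bail-out behavior discussed above (stop when a name passed to -M is not a known metric or group, instead of printing the error and then counting the default events) could be sketched as follows. This is a minimal Python sketch, not perf's code: resolve_metrics, UnknownMetricError and KNOWN_METRICS are hypothetical names, and the metric names are simply taken from the test output quoted in this thread.

```python
# Hypothetical sketch of "bail out on an unknown metric"; not perf's implementation.

# Assumed set of known metrics, taken from the test output quoted in this thread.
KNOWN_METRICS = {
    "branch_misprediction_ratio",
    "all_remote_links_outbound",
    "nps1_die_to_dram",
}


class UnknownMetricError(ValueError):
    """Raised when a requested name is not a known metric or group."""


def resolve_metrics(requested, known=KNOWN_METRICS):
    """Return the requested metrics, or raise before any counting starts."""
    missing = [name for name in requested if name not in known]
    if missing:
        # Fail early rather than falling back to the default event set.
        raise UnknownMetricError(
            "Cannot find metric or group `%s'" % ", ".join(missing)
        )
    return list(requested)
```

With this shape, asking for dram_channel_data_controller_4 (an event name, not a metric) raises immediately, while nps1_die_to_dram resolves normally.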
