linux-kernel - Re: [PATCH] perf test: Retry without grouping for all metrics test

Message-ID: <1320e6e3-c029-2a8c-e8b7-2cfbb781518a@amd.com>
Date:   Wed, 14 Jun 2023 17:08:21 +0530
From:   Ayush Jain <ayush.jain3@....com>
To:     Sandipan Das <sandipan.das@....com>, linux-kernel@...r.kernel.org,
        linux-perf-users@...r.kernel.org
Cc:     peterz@...radead.org, mingo@...hat.com, acme@...nel.org,
        mark.rutland@....com, alexander.shishkin@...ux.intel.com,
        jolsa@...nel.org, namhyung@...nel.org, irogers@...gle.com,
        adrian.hunter@...el.com, kjain@...ux.ibm.com,
        atrajeev@...ux.vnet.ibm.com, barnali@...ux.ibm.com,
        ananth.narayan@....com, ravi.bangoria@....com,
        santosh.shukla@....com
Subject: Re: [PATCH] perf test: Retry without grouping for all metrics test

Hello Sandipan,

Thank you for this patch.

On 6/14/2023 2:37 PM, Sandipan Das wrote:
> There are cases where a metric uses more events than the number of
> counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four data fabric
> counters but the "nps1_die_to_dram" metric has eight events. By default,
> the constituent events are placed in a group. Since the events cannot be
> scheduled at the same time, the metric is not computed. The all metrics
> test also fails because of this.
> 
> Before announcing failure, the test can try multiple options for each
> available metric. After system-wide mode fails, retry once again with
> the "--metric-no-group" option.
> 
> E.g.
> 
>    $ sudo perf test -v 100
> 
> Before:
> 
>    100: perf all metrics test                                           :
>    --- start ---
>    test child forked, pid 672731
>    Testing branch_misprediction_ratio
>    Testing all_remote_links_outbound
>    Testing nps1_die_to_dram
>    Metric 'nps1_die_to_dram' not printed in:
>    Error:
>    Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
>    Testing macro_ops_dispatched
>    Testing all_l2_cache_accesses
>    Testing all_l2_cache_hits
>    Testing all_l2_cache_misses
>    Testing ic_fetch_miss_ratio
>    Testing l2_cache_accesses_from_l2_hwpf
>    Testing l2_cache_misses_from_l2_hwpf
>    Testing op_cache_fetch_miss_ratio
>    Testing l3_read_miss_latency
>    Testing l1_itlb_misses
>    test child finished with -1
>    ---- end ----
>    perf all metrics test: FAILED!
> 
> After:
> 
>    100: perf all metrics test                                           :
>    --- start ---
>    test child forked, pid 672887
>    Testing branch_misprediction_ratio
>    Testing all_remote_links_outbound
>    Testing nps1_die_to_dram
>    Testing macro_ops_dispatched
>    Testing all_l2_cache_accesses
>    Testing all_l2_cache_hits
>    Testing all_l2_cache_misses
>    Testing ic_fetch_miss_ratio
>    Testing l2_cache_accesses_from_l2_hwpf
>    Testing l2_cache_misses_from_l2_hwpf
>    Testing op_cache_fetch_miss_ratio
>    Testing l3_read_miss_latency
>    Testing l1_itlb_misses
>    test child finished with 0
>    ---- end ----
>    perf all metrics test: Ok
> 

The issue is resolved after applying this patch:

   $ ./perf test 102 -vvv
   102: perf all metrics test                                           :
   --- start ---
   test child forked, pid 244991
   Testing branch_misprediction_ratio
   Testing all_remote_links_outbound
   Testing nps1_die_to_dram
   Testing all_l2_cache_accesses
   Testing all_l2_cache_hits
   Testing all_l2_cache_misses
   Testing ic_fetch_miss_ratio
   Testing l2_cache_accesses_from_l2_hwpf
   Testing l2_cache_misses_from_l2_hwpf
   Testing l3_read_miss_latency
   Testing l1_itlb_misses
   test child finished with 0
   ---- end ----
   perf all metrics test: Ok

> Reported-by: Ayush Jain <ayush.jain3@....com>
> Signed-off-by: Sandipan Das <sandipan.das@....com>

Tested-by: Ayush Jain <ayush.jain3@....com>

> ---
>   tools/perf/tests/shell/stat_all_metrics.sh | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/tools/perf/tests/shell/stat_all_metrics.sh b/tools/perf/tests/shell/stat_all_metrics.sh
> index 54774525e18a..1e88ea8c5677 100755
> --- a/tools/perf/tests/shell/stat_all_metrics.sh
> +++ b/tools/perf/tests/shell/stat_all_metrics.sh
> @@ -16,6 +16,13 @@ for m in $(perf list --raw-dump metrics); do
>     then
>       continue
>     fi
> +  # Failed again, possibly there are not enough counters so retry system wide
> +  # mode but without event grouping.
> +  result=$(perf stat -M "$m" --metric-no-group -a sleep 0.01 2>&1)
> +  if [[ "$result" =~ ${m:0:50} ]]
> +  then
> +    continue
> +  fi
>     # Failed again, possibly the workload was too small so retry with something
>     # longer.
>     result=$(perf stat -M "$m" perf bench internals synthesize 2>&1)
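
For anyone reading along, the per-metric retry order with this patch applied
can be sketched roughly as below. This is only an illustration, not the code
in stat_all_metrics.sh: the helper name try_metric and the attempts array are
hypothetical, and the first (per-thread) attempt is an assumption inferred
from the "per-thread mode" error in the "Before" output. The
"--metric-no-group -a" step and the longer-workload step are taken from the
diff above.

```shell
#!/bin/bash
# Rough sketch of the retry ladder for a single metric.
# try_metric and attempts are hypothetical names used for illustration.

# try_metric NAME: return 0 as soon as one perf stat run prints the metric.
try_metric() {
  local m=$1 result args
  local attempts=(
    "true"                             # assumed: default per-thread run
    "-a sleep 0.01"                    # system wide mode
    "--metric-no-group -a sleep 0.01"  # system wide, no grouping (this patch)
    "perf bench internals synthesize"  # longer workload
  )
  for args in "${attempts[@]}"; do
    # Word splitting of $args is intentional: each entry holds several words.
    # shellcheck disable=SC2086
    result=$(perf stat -M "$m" $args 2>&1)
    # Same check as the patch: look for the first 50 characters of the name.
    if [[ "$result" =~ ${m:0:50} ]]; then
      return 0
    fi
  done
  # Every attempt failed: report the output of the last one.
  echo "Metric '$m' not printed in:"
  echo "$result"
  return 1
}
```

With this shape, a metric such as nps1_die_to_dram, whose events outnumber
the available counters, would be expected to fall through the first two
attempts and pass on the third, which matches the "After" output above.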

Thanks & Regards,
Ayush Jain