lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a4b93320-cf5d-9622-77fe-f51e1ec7f75b@arm.com>
Date:   Wed, 9 Aug 2023 14:06:30 +0100
From:   James Clark <james.clark@....com>
To:     John Garry <john.g.garry@...cle.com>,
        linux-perf-users@...r.kernel.org, irogers@...gle.com,
        renyu.zj@...ux.alibaba.com
Cc:     Will Deacon <will@...nel.org>, Mike Leach <mike.leach@...aro.org>,
        Leo Yan <leo.yan@...aro.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Suzuki K Poulose <suzuki.poulose@....com>,
        Kajol Jain <kjain@...ux.ibm.com>,
        Kan Liang <kan.liang@...ux.intel.com>,
        Nick Forrington <nick.forrington@....com>,
        Eduard Zingerman <eddyz87@...il.com>,
        Rob Herring <robh@...nel.org>, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, coresight@...ts.linaro.org
Subject: Re: [PATCH v4 5/6] perf vendor events arm64: Update stall_slot
 workaround for N2 r0p3



On 08/08/2023 11:18, John Garry wrote:
> On 07/08/2023 15:20, James Clark wrote:
> 
> Hi James,
> 
> Thanks for the effort in doing this.
> 
>> N2 r0p3 doesn't require the workaround [1], so gating on (#slots - 5) no
>> longer works because all N2s have 5 slots. Add a new expression builtin
>> that identifies the need for the workaround correctly.
>>
>> [1]:
>> https://urldefense.com/v3/__https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-n2-r0p3.json__;!!ACWV5N9M2RV99hQ!Nx1xgWXXS9GBasNpOKQXHWKe8VwpRB0h8lAfOmwUmkkOQTjFqn2NswO8ti8vTcblUfAYN9NAtxqAh-sf0TEOvQ$
>> Signed-off-by: James Clark <james.clark@....com>
>> ---
>>   tools/perf/arch/arm64/util/pmu.c              | 21 +++++++++++++++++++
>>   .../arm64/arm/neoverse-n2-v2/metrics.json     |  8 +++----
>>   tools/perf/util/expr.c                        |  4 ++++
>>   tools/perf/util/pmu.c                         |  6 ++++++
>>   tools/perf/util/pmu.h                         |  1 +
>>   5 files changed, 36 insertions(+), 4 deletions(-)
>>
>> diff --git a/tools/perf/arch/arm64/util/pmu.c
>> b/tools/perf/arch/arm64/util/pmu.c
>> index 561de0cb6b95..30e2385a83cf 100644
>> --- a/tools/perf/arch/arm64/util/pmu.c
>> +++ b/tools/perf/arch/arm64/util/pmu.c
>> @@ -2,6 +2,7 @@
>>     #include <internal/cpumap.h>
>>   #include "../../../util/cpumap.h"
>> +#include "../../../util/header.h"
>>   #include "../../../util/pmu.h"
>>   #include "../../../util/pmus.h"
>>   #include <api/fs/fs.h>
>> @@ -62,3 +63,23 @@ double perf_pmu__cpu_slots_per_cycle(void)
>>         return slots ? (double)slots : NAN;
>>   }
>> +
>> +double perf_pmu__no_stall_errata(void)
> 
> While I like the approach of encoding the per-CPU support in the metric
> expression, I find that literal "no stall errata" is vague and could
> apply to any "stall errata" for any SoC for any architecture.
> 
> If we were going to do it this way, then we would need a more specific
> name, like perf_pmu__no_stall_errata_arm64_errata123456(), but that
> should not be in the core perf code.
> 
> Could we instead add a function to check cpuid and have something like
> this in the JSON:
> 
> "MetricExpr": "(op_retired / op_spec) * (1 - (stall_slot if
> (cpuid_less_than(410fd493)) else (stall_slot - cpu_cycles)) / (#slots *
> cpu_cycles))"
> 
> I'm currently figuring out how cpuid_less_than() would be implemented
> (I'm no great python wrangler), but it would be along the lines of what
> Ian added for "has_event" in
> https://lore.kernel.org/linux-perf-users/20230623151016.4193660-1-irogers@google.com/
> 
> Thanks,
> John

Yeah it looks like it could be done that way. Also, the way I added it,
it doesn't have access to the PMU type, it just does a generic
pmu__find_core_pmu() so won't work very well for heterogeneous systems.

If we're going to do a deeper modification of the expression parser like
with has_event() it might be possible to pass in the actual CPU ID that
the metric is running on which would be better.

I'll have a look.

James

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ