Message-ID: <770b7f48-2a5a-1c1b-b26d-bb0cfc3a1b46@linux.alibaba.com>
Date:   Thu, 8 Jun 2023 18:11:03 +0800
From:   Jing Zhang <renyu.zj@...ux.alibaba.com>
To:     Robin Murphy <robin.murphy@....com>
Cc:     James Clark <james.clark@....com>,
        Mike Leach <mike.leach@...aro.org>,
        Leo Yan <leo.yan@...aro.org>,
        Mark Rutland <mark.rutland@....com>,
        Ilkka Koskinen <ilkka@...amperecomputing.com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Jiri Olsa <jolsa@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Adrian Hunter <adrian.hunter@...el.com>,
        linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        linux-perf-users@...r.kernel.org,
        Zhuo Song <zhuo.song@...ux.alibaba.com>,
        Ian Rogers <irogers@...gle.com>, Will Deacon <will@...nel.org>,
        Shuai Xue <xueshuai@...ux.alibaba.com>,
        John Garry <john.g.garry@...cle.com>
Subject: Re: [PATCH v3 2/7] perf metric: Event "Compat" value supports
 matching multiple identifiers



On 2023/6/7 12:27 AM, Robin Murphy wrote:
> On 02/06/2023 5:20 pm, John Garry wrote:
>> On 01/06/2023 09:58, Jing Zhang wrote:
>>>> From checking the driver, it seems that we have model names "arm_cmn600" and "arm_cmn650". Are you saying that "arm_cmn600X" would match for those? I am most curious about how "arm_cmn600X" matches "arm_cmn650".
>>>>
>>> Hi John,
>>>
>>> From patch #1 we have identifiers "arm_cmn600_0" and "arm_cmn650_0" etc.
>>
>> ok, I see. Your idea for the cmn driver HW identifier format is odd to me. Your HW identifier is a mix of the HW IP model name (from arm_cmn_device_data.model_name) with the kernel revision identifier (from cmn_revision). The kernel revision identifier is an enum, and I don't think that it is a good idea to expose enum values through sysfs files.
>>
>> I assume that there is some ordering requirement for cmn_revision, considering that we have equality operations on the revision in the driver.
> 
> That enum does actually follow the revision identifiers as provided by the hardware (see CMN_CFGM_PID2_REVISION), so I don't see any major issue with putting it into user ABI. And TBH I think I would prefer to just use a numeric value rather than have to maintain yet more tables of strings which given the usage model here would effectively only mangle a matchable value into a different matchable value anyway.
> 
> I am inclined to agree that the mix between part driver-generated-string, part hardware-value looks a little funky. I still need to check with the hardware team exactly how the part number field from PERIPH_ID_0/1 is "configuration-dependent", and whether there might actually be a chance of using that as well.
> 

Thanks Robin. Should we wait until the configuration dependence of the PERIPH_ID_0/1 part number field is confirmed before pushing this patch?
Or is it acceptable to use "cmn600_r0p0" as an identifier? Looking forward to your suggestion.

> One nagging doubt that remains for metrics are any baked-in assumptions which may not always simply depend on the product version - for instance it happens to be the case currently that everything has a fixed flit size of 256 bits, hence the magic "32" in the bandwidth calculations, but if that ever became configurable in some future product, we may potentially have a problem guaranteeing a meaningful calculation.
> 

In that case, we could use a metric "literal" to solve it later, so that the bandwidth calculation uses a variable instead of a
constant. For example, "#slots" gets its value from a file for the current architecture, and the metric can then use that value.
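
To illustrate the idea only (this is a Python sketch, not perf code): the bandwidth calculation takes the flit size as a parameter
instead of hard-coding 32. The names flit_bytes and bandwidth_bytes_per_sec are hypothetical, and the 256-bit default is just the
current fixed flit size mentioned above.

```python
def flit_bytes(flit_bits=256):
    """Bytes per flit; 256 bits gives the "32" used in the metrics today."""
    if flit_bits <= 0 or flit_bits % 8 != 0:
        raise ValueError("flit_bits must be a positive multiple of 8")
    return flit_bits // 8


def bandwidth_bytes_per_sec(flit_count, duration_sec, flit_bits=256):
    """Bandwidth from a flit count, with the flit size supplied as a variable."""
    if duration_sec <= 0:
        raise ValueError("duration_sec must be positive")
    return flit_count * flit_bytes(flit_bits) / duration_sec
```

If a future product made the flit size configurable, only the value passed as flit_bits would change and the expression itself
would stay the same.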

>>> The identifier consists of model_name and revision.
>>> The compatible value "arm_cmn600;arm_cmn650" can match the identifier "arm_cmn600_0" or "arm_cmn650_0". Maybe the commit message
>>> is not clear enough.
>>>
>>> For example in patch #3 the metric "slc_miss_rate" is a generic metric for cmn-any. So we can define:
>>>
>>> +    {
>>> +        "MetricName": "slc_miss_rate",
>>> +        "BriefDescription": "The system level cache miss rate include.",
>>> +        "MetricGroup": "arm_cmn",
>>> +        "MetricExpr": "hnf_cache_miss / hnf_slc_sf_cache_access",
>>> +        "ScaleUnit": "100%",
>>> +        "Unit": "arm_cmn",
>>> +        "Compat": "arm_cmn600;arm_cmn650;arm_cmn700;arm_ci700"
>>> +    },
>>>
>>>
>>> It can match identifiers "arm_cmn600_{0,1,2..X}" or "arm_cmn650_{0,1,2..X}" or "arm_cmn700_{0,1,2..X}" or "arm_ci700_{0,1,2..X}".
>>> In other words, it can match all identifiers prefixed with “arm_cmn600” or “arm_cmn650” or “arm_cmn700” or “arm_ci700”.
>>>
>>> If a new model of the arm_cmn PMU has the identifier "arm_cmn750_0", it will not be matched, but if a new revision has the identifier
>>> "arm_cmn700_4", it can be matched.
>>
>> OK, I see what you mean. My confusion came about through your commit message on this same patch, which did not mention cmn650. I assumed that the example event which you were describing was supported for arm_cmn650 and you intentionally omitted it.
>>
>>>
>>>
>>>>> Tokens in Unit field are delimited by ';'.
>>>> Thanks for taking a stab at solving this problem.
>>>>
>>>> I have to admit that I am not the biggest fan of having multiple values to match in the "Compat" value possibly for every event. It doesn't really scale.
>>>>
>>>> I would hope that there are at least some events which are guaranteed to always be present. From what Robin said on the v2 series, for the implementations which we care about, events are generally added per subsequent version. So we should have some base set of fixed events.
> 
> Note that there's a slight difference between "present" and "valid", e.g. in the current driver-internal aliases, all MTSX events are marked CMN_ANY, meaning they're considered valid on any CMN configuration with an MTSX node, regardless of model. The events don't exist on CMN-600 or CMN-650, but that's because the MTSX itself wasn't a thing yet, so for simplicity we don't have to bother considering the events invalid when we know they will always be non-present and thus filtered anyway.
> 
>>>> If we are confident that we have a fixed base set of events, can we ensure that those events would not require this compat string which needs each version explicitly stated?
>>>>
>>> If we are sure that some events will always exist in subsequent versions, we can set the Compat field to "arm_cmn;arm_ci". After that,
>>> whether it is a different model or a different revision of the cmn PMU, it will be compatible. That is, it matches all whose identifier
>>> is prefixed with “arm_cmn” or “arm_ci”.
>>
>> Sure, we could do something like that. Or if we are super-confident that every model and rev will support some event, then we can change perf tool to just not check Compat for that event.
> 
> The majority of events have stayed unchanged since the introduction of their respective node type, so assuming we already have a basic match on the PMU name to know which JSON to be looking at in the first place, I'd imagine the Compat field could be optional, and only needed for events which first appear in a subsequent revision or model, or the fiddly cases like where DVM node events got entirely rewritten in CMN-650.
> 

OK, thanks. Maybe we first need to confirm how to set the identifier format in the driver. Then it will be clearer how to implement Compat matching.
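
For reference, the matching rule discussed in this thread can be sketched roughly as below. This is only a Python illustration of
the intended semantics, not the perf tool code; compat_matches is a hypothetical name, and the identifier strings assume the
"model_revision" format from patch #1, which is still under discussion.

```python
def compat_matches(compat, identifier):
    """Return True if any ';'-delimited Compat token is a prefix of identifier."""
    tokens = [tok.strip() for tok in compat.split(";") if tok.strip()]
    return any(identifier.startswith(tok) for tok in tokens)


# Compat value taken from the "slc_miss_rate" metric quoted above.
SLC_COMPAT = "arm_cmn600;arm_cmn650;arm_cmn700;arm_ci700"

for ident in ("arm_cmn600_0", "arm_cmn700_4", "arm_cmn750_0"):
    print(ident, compat_matches(SLC_COMPAT, ident))
```

With a broader value such as "arm_cmn;arm_ci", "arm_cmn750_0" would match as well. Note that a plain prefix test would also accept
a hypothetical identifier like "arm_cmn6000_0" for the token "arm_cmn600", so the final identifier format may need a delimiter
check.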


Thanks,
Jing
