Message-ID: <1637567b-42df-57d5-2987-939ffbf451ef@arm.com>
Date: Wed, 16 Mar 2022 15:10:55 +0000
From: German Gomez <german.gomez@....com>
To: Leo Yan <leo.yan@...aro.org>, Ali Saidi <alisaidi@...zon.com>
Cc: acme@...nel.org, alexander.shishkin@...ux.intel.com,
andrew.kilroy@....com, benh@...nel.crashing.org,
james.clark@....com, john.garry@...wei.com, jolsa@...nel.org,
kjain@...ux.ibm.com, lihuafei1@...wei.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org, mark.rutland@....com,
mathieu.poirier@...aro.org, mingo@...hat.com, namhyung@...nel.org,
peterz@...radead.org, will@...nel.org, yao.jin@...ux.intel.com,
Nick.Forrington@....com
Subject: Re: [PATCH v2 2/2] perf mem: Support HITM for when mem_lvl_num is
used
On 16/03/2022 12:42, Leo Yan wrote:
> On Wed, Mar 16, 2022 at 11:43:52AM +0000, German Gomez wrote:
>
> [...]
>
>>>>> I had a look at the TRMs for the N1[1], V1[2] and N2[3] Neoverse cores
>>>>> (specifically the LL_CACHE_RD pmu events). If we were to assign a number
>>>>> to the system cache (assuming all caches are implemented):
>>>>>
>>>>> *For N1*, if L2 and L3 are implemented, system cache would follow at *L4*
>>>> To date no one has built 4 levels though. Everyone has only built three.
>>> The N1SDP board advertises 4 levels (we use it regularly for testing perf patches)
>> That said, it's probably the odd one out.
>>
>> I'm not against assuming 3 levels. Later, if there is a strong need for L4, we can indeed go back and change it.
> Thanks for the info.
>
> Exploring the cache hierarchy via sysfs is a good idea; my only
> concern is: can we simply treat the system cache as the same
> thing as the highest level cache? If so, I think another option is to
For Neoverse, it should be. The LL_CACHE_RD PMU event description says (if a system cache is implemented):
* If CPUECTLR.EXTLLC is set: This event counts any cacheable read transaction which returns a data source of 'interconnect cache'.
> define a cache level as "PERF_MEM_LVLNUM_SYSTEM_CACHE" and extend the
> decoding code to support it.
>
> With PERF_MEM_LVLNUM_SYSTEM_CACHE, we can tell users clearly that the
> data source is the system cache, and users can easily map this info to
> the cache media on the working platform.
>
> In practice, I don't object to using cache level 3 as a first step. At
> least this meets the requirement at the current stage.
Ok, I agree. I think for now it is a good compromise.
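
To make the compromise concrete, here is a minimal Python sketch (not the patch itself) of reporting the system cache as level 3 in the mem_lvl_num field. The PERF_MEM_LVLNUM_* values and the shift mirror include/uapi/linux/perf_event.h; the source labels and the names SOURCE_TO_LVLNUM, encode_lvl_num and decode_lvl_num are hypothetical and exist only for illustration.

```python
# Values mirroring PERF_MEM_LVLNUM_* in include/uapi/linux/perf_event.h.
PERF_MEM_LVLNUM_SHIFT = 33
PERF_MEM_LVLNUM_L1 = 0x01
PERF_MEM_LVLNUM_L2 = 0x02
PERF_MEM_LVLNUM_L3 = 0x03
PERF_MEM_LVLNUM_L4 = 0x04
PERF_MEM_LVLNUM_RAM = 0x0D
PERF_MEM_LVLNUM_NA = 0x0F

# Hypothetical data-source labels (assumption, not from the patch).
SOURCE_TO_LVLNUM = {
    "l1": PERF_MEM_LVLNUM_L1,
    "l2": PERF_MEM_LVLNUM_L2,
    "l3": PERF_MEM_LVLNUM_L3,
    # The compromise discussed here: report the system cache as level 3
    # instead of introducing a dedicated level or using L4.
    "system_cache": PERF_MEM_LVLNUM_L3,
    "dram": PERF_MEM_LVLNUM_RAM,
}


def encode_lvl_num(source):
    """Place the level number for `source` into the mem_lvl_num bit field."""
    lvl = SOURCE_TO_LVLNUM.get(source, PERF_MEM_LVLNUM_NA)
    return lvl << PERF_MEM_LVLNUM_SHIFT


def decode_lvl_num(data_src):
    """Extract the 4-bit mem_lvl_num field from a data_src value."""
    return (data_src >> PERF_MEM_LVLNUM_SHIFT) & 0xF
```

With this mapping a system cache hit and an L3 hit decode to the same level, which is exactly the information that a dedicated PERF_MEM_LVLNUM_SYSTEM_CACHE value would preserve.
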
Detecting the caches seems like an additional/separate perf feature.
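
As a rough illustration of what such detection could look like from user space, the sketch below walks /sys/devices/system/cpu/cpuN/cache/index*/ and reads the "level" and "type" files. The function names read_cache_levels and highest_level are hypothetical, and whether a system-level cache shows up in sysfs at all is an assumption that depends on how the firmware describes the platform.

```python
from pathlib import Path


def read_cache_levels(cpu=0, root="/sys/devices/system/cpu"):
    """Return a sorted list of (level, type) tuples for one CPU's caches."""
    cache_dir = Path(root) / f"cpu{cpu}" / "cache"
    caches = []
    for index in cache_dir.glob("index[0-9]*"):
        try:
            level = int((index / "level").read_text().strip())
            ctype = (index / "type").read_text().strip()
        except (OSError, ValueError):
            # Skip entries that are missing or not parseable.
            continue
        caches.append((level, ctype))
    return sorted(caches)


def highest_level(caches):
    """Return the highest cache level seen, or None if none were found."""
    return max((level for level, _ in caches), default=None)


if __name__ == "__main__":
    found = read_cache_levels(cpu=0)
    print(found, "highest level:", highest_level(found))
```
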
Thanks,
German
> Thanks,
> Leo