[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <da902ef2-df87-ed71-275b-f7b41d1afc9a@arm.com>
Date: Thu, 7 Apr 2022 16:24:35 +0100
From: German Gomez <german.gomez@....com>
To: Ali Saidi <alisaidi@...zon.com>, Leo Yan <leo.yan@...aro.org>
Cc: Nick.Forrington@....com, acme@...nel.org,
alexander.shishkin@...ux.intel.com, andrew.kilroy@....com,
benh@...nel.crashing.org, james.clark@....com,
john.garry@...wei.com, jolsa@...nel.org, kjain@...ux.ibm.com,
lihuafei1@...wei.com, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
mark.rutland@....com, mathieu.poirier@...aro.org, mingo@...hat.com,
namhyung@...nel.org, peterz@...radead.org, will@...nel.org
Subject: Re: [PATCH v4 2/4] perf arm-spe: Use SPE data source for neoverse
cores
Hi,
On 31/03/2022 13:44, Leo Yan wrote:
> [...]
>>> I'd like to do this in a separate patch, but I have one other proposal. The
>>> Neoverse cores L2 is strictly inclusive of the L1, so even if it's in the L1,
>>> it's also in the L2. Given that the Graviton systems and afaik the Ampere
>>> systems don't have any cache between the L2 and the SLC, thus anything from
>>> PEER_CORE, LCL_CLSTR, or PEER_CLSTR would hit in the L2, perhaps we
>>> should just set L2 for these cases? German, are you good with this for now?
>> Sorry for the delay. I'd like to also check this with someone. I'll try
>> to get back asap. In the meantime, if this approach is also OK with Leo,
>> I think it would be fine by me.
Sorry for the delay. Yeah setting it to L2 indeed looks reasonable for
now. Somebody brought up the case of running SPE in a heterogeneousĀ
system, but also we think might be beyond the scope of this change.
One very minor nit though. Would you be ok with renaming LCL to LOCALĀ
and CLSTR to CLUSTER? I sometimes mistead the former as "LLC".
> Thanks for the checking internally. Let me just bring up my another
> thinking (sorry that my suggestion is float): another choice is we set
> ANY_CACHE as cache level if we are not certain the cache level, and
> extend snoop field to indicate the snooping logics, like:
>
> PERF_MEM_SNOOP_PEER_CORE
> PERF_MEM_SNOOP_LCL_CLSTR
> PERF_MEM_SNOOP_PEER_CLSTR
>
> Seems to me, we doing this is not only for cache level, it's more
> important for users to know the variant cost for involving different
> snooping logics.
>
> Thanks,
> Leo
I see there's been some more messages I need to catch up with. Is theĀ
intention to extend the PERF_MEM_* flags for this cset, or will it be
left for a later change?
In any case, I'd be keen to take another look at it and try to bring
some more eyes into this.
Thanks,
German
Powered by blists - more mailing lists