Message-ID: <Yj8Vd2vsw8yJ8b4x@kernel.org>
Date: Sat, 26 Mar 2022 10:30:31 -0300
From: Arnaldo Carvalho de Melo <acme@...nel.org>
To: Leo Yan <leo.yan@...aro.org>
Cc: Ali Saidi <alisaidi@...zon.com>, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, german.gomez@....com,
benh@...nel.crashing.org, Nick.Forrington@....com,
alexander.shishkin@...ux.intel.com, andrew.kilroy@....com,
james.clark@....com, john.garry@...wei.com, jolsa@...nel.org,
kjain@...ux.ibm.com, lihuafei1@...wei.com, mark.rutland@....com,
mathieu.poirier@...aro.org, mingo@...hat.com, namhyung@...nel.org,
peterz@...radead.org, will@...nel.org
Subject: Re: [PATCH v4 4/4] perf mem: Support HITM for when mem_lvl_num is any
On Sat, Mar 26, 2022 at 02:23:03PM +0800, Leo Yan wrote:
> On Thu, Mar 24, 2022 at 06:33:23PM +0000, Ali Saidi wrote:
> > For loads that hit in the LLC snoop filter and are fulfilled from a
> > higher level cache on arm64 Neoverse cores, it's usually not clear which
> > cache level the data actually came from (i.e. a transfer from a
> > core could come from its L1 or L2). Instead of making an assumption about
> > where the line came from, add support for incrementing HITM if the
> > source is CACHE_ANY.
> >
> > Since other architectures don't seem to populate the mem_lvl_num field
> > here there shouldn't be a change in functionality.
> >
> > Signed-off-by: Ali Saidi <alisaidi@...zon.com>
> > Tested-by: German Gomez <german.gomez@....com>
> > Reviewed-by: German Gomez <german.gomez@....com>
> > ---
> > tools/perf/util/mem-events.c | 9 +++++++++
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
> > index e5e405185498..084977cfebef 100644
> > --- a/tools/perf/util/mem-events.c
> > +++ b/tools/perf/util/mem-events.c
> > @@ -539,6 +539,15 @@ do { \
> > stats->ld_llchit++;
> > }
> >
> > + /*
> > + * A hit in another core's cache must mean an LLC snoop
> > + * filter hit
> > + */
> > + if (lnum == P(LVLNUM, ANY_CACHE)) {
> > + if (snoop & P(SNOOP, HITM))
> > + HITM_INC(lcl_hitm);
> > + }
>
> This might break the memory profiling result for x86, see file
> arch/x86/events/intel/ds.c:
>
> void __init intel_pmu_pebs_data_source_skl(bool pmem)
> {
> 	u64 pmem_or_l4 = pmem ? LEVEL(PMEM) : LEVEL(L4);
> 	...
> 	pebs_data_source[0x0d] = OP_LH | LEVEL(ANY_CACHE) | REM | P(SNOOP, HITM);
> }
>
> This means an access can be remote while its cache level is
> ANY_CACHE, so it would be good to also check for
> PERF_MEM_REMOTE_REMOTE:
>
> u64 remote = data_src->mem_remote;
>
> /*
> * A hit in another core's cache must mean an LLC snoop
> * filter hit
> */
> if (lnum == P(LVLNUM, ANY_CACHE) && remote != P(REMOTE, REMOTE)) {
> if (snoop & P(SNOOP, HITM))
> HITM_INC(lcl_hitm);
> }
>
> Appreciate German's reviewing and testing, and sorry I jumped in very
> late.
I have not published this on perf/core; it's just in tmp.perf/core while
tests run, so I'll remove this specific patch and rerun the tests. Thanks
for reviewing.
- Arnaldo
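
For reference, the check Leo suggests can be sketched as a standalone
predicate. This is only an illustration, not the code in mem-events.c: the
sk_* names and structs are hypothetical, the constant values are assumed to
mirror the PERF_MEM_* encodings, and sk_account() assumes HITM_INC() bumps
both the named counter and a total.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the perf_mem_data_src encodings.
 * The values are assumptions made for this sketch.
 */
enum {
	SK_LVLNUM_ANY_CACHE = 0x0b,	/* assumed: cache level "any cache" */
	SK_REMOTE_REMOTE    = 0x01,	/* assumed: access was remote */
	SK_SNOOP_HITM       = 0x10,	/* assumed: snoop hit a modified line */
};

/* Hypothetical, simplified view of the decoded data source fields. */
struct sk_data_src {
	uint64_t lvl_num;
	uint64_t remote;
	uint64_t snoop;
};

/* Hypothetical subset of the counters kept per sample stream. */
struct sk_stats {
	unsigned int lcl_hitm;
	unsigned int tot_hitm;
};

/*
 * True only for a HITM whose level is ANY_CACHE and which is not
 * flagged as remote, i.e. the case proposed to count as a local HITM.
 */
static bool sk_is_local_any_cache_hitm(const struct sk_data_src *src)
{
	return src->lvl_num == SK_LVLNUM_ANY_CACHE &&
	       src->remote != SK_REMOTE_REMOTE &&
	       (src->snoop & SK_SNOOP_HITM) != 0;
}

/* Count the sample; remote ANY_CACHE HITMs are left untouched here. */
static void sk_account(struct sk_stats *stats, const struct sk_data_src *src)
{
	if (sk_is_local_any_cache_hitm(src)) {
		stats->lcl_hitm++;
		stats->tot_hitm++;
	}
}
```

With this shape, a sample like the x86 pebs_data_source[0x0d] entry quoted
above (ANY_CACHE, remote, HITM) would not increment lcl_hitm, while an
ANY_CACHE HITM without the remote flag would.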