Message-ID: <YTiBqbxe7ieqY2OE@hirez.programming.kicks-ass.net>
Date: Wed, 8 Sep 2021 11:26:01 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Michael Ellerman <mpe@...erman.id.au>
Cc: Kajol Jain <kjain@...ux.ibm.com>, linuxppc-dev@...ts.ozlabs.org,
linux-kernel@...r.kernel.org, mingo@...hat.com, acme@...nel.org,
jolsa@...nel.org, namhyung@...nel.org,
linux-perf-users@...r.kernel.org, ak@...ux.intel.com,
maddy@...ux.ibm.com, atrajeev@...ux.vnet.ibm.com,
rnsastry@...ux.ibm.com, yao.jin@...ux.intel.com, ast@...nel.org,
daniel@...earbox.net, songliubraving@...com,
kan.liang@...ux.intel.com, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, paulus@...ba.org
Subject: Re: [PATCH 1/3] perf: Add macros to specify onchip L2/L3 accesses
On Wed, Sep 08, 2021 at 05:17:53PM +1000, Michael Ellerman wrote:
> Kajol Jain <kjain@...ux.ibm.com> writes:
> > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> > index f92880a15645..030b3e990ac3 100644
> > --- a/include/uapi/linux/perf_event.h
> > +++ b/include/uapi/linux/perf_event.h
> > @@ -1265,7 +1265,9 @@ union perf_mem_data_src {
> > #define PERF_MEM_LVLNUM_L2 0x02 /* L2 */
> > #define PERF_MEM_LVLNUM_L3 0x03 /* L3 */
> > #define PERF_MEM_LVLNUM_L4 0x04 /* L4 */
> > -/* 5-0xa available */
> > +#define PERF_MEM_LVLNUM_OC_L2 0x05 /* On Chip L2 */
> > +#define PERF_MEM_LVLNUM_OC_L3 0x06 /* On Chip L3 */
>
> The obvious use for 5 is for "L5" and so on.
>
> I'm not sure adding new levels is the best idea, because these don't fit
> neatly into the hierarchy, they are off to the side.
>
>
> I wonder if we should use the remote field.
>
> ie. for another core's L2 we set:
>
> mem_lvl = PERF_MEM_LVL_L2
> mem_remote = 1
This mixes APIs (see below). IIUC the correct usage would be something
like: lvl_num=L2 remote=1
> Which would mean "remote L2", but not remote enough to be
> lvl = PERF_MEM_LVL_REM_CCE1.
>
> It would be printed by the existing tools/perf code as "Remote L2", vs
> "Remote cache (1 hop)", which seems OK.
>
>
> ie. we'd be able to express:
>
> Current core's L2: LVL_L2
> Other core's L2: LVL_L2 | REMOTE
> Other chip's L2: LVL_REM_CCE1 | REMOTE
>
> And similarly for L3.
>
> I think that makes sense? Unless people think remote should be reserved
> to mean on another chip, though we already have REM_CCE1 for that.
IIRC the PERF_MEM_LVL_* namespace is somewhat deprecated in favour of
the newer composite PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_} fields. Of
course, ABIs being what they are, we get to support both :/ But I'm not
sure mixing them is a great idea.
Also, clearly this could use a comment...
The 'new' composite doesn't have a hops field because the hardware that
necessitated that change doesn't report it, but we could easily add a
field there.
Suppose we add mem_hops:3 (would 6 hops be too small?) and the
corresponding PERF_MEM_HOPS_{NA, 0..6}
Then I suppose you can encode things like:
  L2                    - local L2
  L2 | REMOTE           - remote L2 at an unspecified distance (NA)
  L2 | REMOTE | HOPS_0  - remote L2 on the same node
  L2 | REMOTE | HOPS_1  - remote L2 on a node 1 removed
Would that work?
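
A rough sketch of what that encoding could look like, just to make the
proposal concrete. Only PERF_MEM_LVLNUM_L2 is taken from the patch
context above; the sketch_* struct, the SKETCH_MEM_HOPS_* values and both
helpers are hypothetical. The struct is a simplified stand-in and does
not reproduce the real union perf_mem_data_src layout. It assumes NA
takes encoding 0, which leaves the other seven values of a 3-bit field
for hops 0..6.

```c
#include <assert.h>

/* From the quoted patch context. */
#define PERF_MEM_LVLNUM_L2	0x02	/* L2 */

/* HYPOTHETICAL encodings for the proposed 3-bit mem_hops field. */
#define SKETCH_MEM_HOPS_NA	0x00		/* distance not reported */
#define SKETCH_MEM_HOPS(n)	((n) + 1)	/* n = 0..6 hops */

/*
 * Simplified stand-in for illustration only; this is NOT the real
 * layout of union perf_mem_data_src.
 */
struct sketch_data_src {
	unsigned int lvl_num:4;	/* memory hierarchy level number */
	unsigned int remote:1;	/* 1 if the access was remote */
	unsigned int hops:3;	/* proposed: NA or hop count + 1 */
};

/* Build a value, checking that each argument fits its field. */
static inline struct sketch_data_src
sketch_encode(unsigned int lvl_num, unsigned int remote, unsigned int hops)
{
	struct sketch_data_src s = {
		.lvl_num = lvl_num,
		.remote  = remote,
		.hops    = hops,
	};

	assert(lvl_num < 16 && remote < 2 && hops < 8);
	return s;
}

/*
 * Return the hop count for a remote access, or -1 when the access is
 * local or the distance was not reported (NA).
 */
static inline int sketch_hop_distance(struct sketch_data_src s)
{
	if (!s.remote || s.hops == SKETCH_MEM_HOPS_NA)
		return -1;
	return (int)s.hops - 1;
}
```

With this, "L2 | REMOTE | HOPS_1" would be
sketch_encode(PERF_MEM_LVLNUM_L2, 1, SKETCH_MEM_HOPS(1)), and plain
"L2 | REMOTE" leaves hops at SKETCH_MEM_HOPS_NA.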