linux-kernel - Re: [PATCH 1/3] perf: Add macros to specify onchip L2/L3 accesses

Message-ID: <87ilzbmt7i.fsf@mpe.ellerman.id.au>
Date:   Wed, 08 Sep 2021 17:17:53 +1000
From:   Michael Ellerman <mpe@...erman.id.au>
To:     Kajol Jain <kjain@...ux.ibm.com>, linuxppc-dev@...ts.ozlabs.org,
        linux-kernel@...r.kernel.org, peterz@...radead.org,
        mingo@...hat.com, acme@...nel.org, jolsa@...nel.org,
        namhyung@...nel.org, linux-perf-users@...r.kernel.org,
        ak@...ux.intel.com
Cc:     maddy@...ux.ibm.com, atrajeev@...ux.vnet.ibm.com,
        kjain@...ux.ibm.com, rnsastry@...ux.ibm.com,
        yao.jin@...ux.intel.com, ast@...nel.org, daniel@...earbox.net,
        songliubraving@...com, kan.liang@...ux.intel.com,
        mark.rutland@....com, alexander.shishkin@...ux.intel.com,
        paulus@...ba.org
Subject: Re: [PATCH 1/3] perf: Add macros to specify onchip L2/L3 accesses

Kajol Jain <kjain@...ux.ibm.com> writes:
> Add couple of new macros to represent onchip L2 and onchip L3 accesses.

It would be "on chip". But I think this needs much more explanation:
this is a generic header, so these definitions need to make sense, and
have an understood meaning, across all architectures.

I think most people are going to read "on chip" as differentiating
between an L2/L3 that is "on chip" vs "off chip".

But the case you're trying to express is "another core's L2/L3 on the
same chip as the CPU", vs "the current CPU's L2/L3".

> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index f92880a15645..030b3e990ac3 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -1265,7 +1265,9 @@ union perf_mem_data_src {
>  #define PERF_MEM_LVLNUM_L2	0x02 /* L2 */
>  #define PERF_MEM_LVLNUM_L3	0x03 /* L3 */
>  #define PERF_MEM_LVLNUM_L4	0x04 /* L4 */
> -/* 5-0xa available */
> +#define PERF_MEM_LVLNUM_OC_L2	0x05 /* On Chip L2 */
> +#define PERF_MEM_LVLNUM_OC_L3	0x06 /* On Chip L3 */

The obvious use for 5 is for "L5" and so on.

I'm not sure adding new levels is the best idea, because these don't fit
neatly into the hierarchy, they are off to the side.

I wonder if we should use the remote field.

i.e. for another core's L2 we set:

  mem_lvl = PERF_MEM_LVL_L2
  mem_remote = 1

Which would mean "remote L2", but not remote enough to be
lvl = PERF_MEM_LVL_REM_CCE1.

It would be printed by the existing tools/perf code as "Remote L2", vs
"Remote cache (1 hop)", which seems OK.

i.e. we'd be able to express:

  Current core's L2: LVL_L2
  Other core's L2:   LVL_L2 | REMOTE
  Other chip's L2:   LVL_REM_CCE1 | REMOTE

And similarly for L3.
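
As a rough illustration of the three cases above, here is a minimal sketch of how a consumer could tell them apart. It is not the kernel or tools/perf code: the `SKETCH_*` names, `struct sketch_data_src` and `sketch_classify()` are hypothetical stand-ins, and the bit values are assumed to mirror PERF_MEM_LVL_L2, PERF_MEM_LVL_L3 and PERF_MEM_LVL_REM_CCE1 from include/uapi/linux/perf_event.h. Real code would read mem_lvl and mem_remote from union perf_mem_data_src instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local copies of the level bits, assumed to match the
 * PERF_MEM_LVL_* values in include/uapi/linux/perf_event.h. */
enum {
	SKETCH_LVL_L2       = 0x20,  /* assumed == PERF_MEM_LVL_L2 */
	SKETCH_LVL_L3       = 0x40,  /* assumed == PERF_MEM_LVL_L3 */
	SKETCH_LVL_REM_CCE1 = 0x400, /* assumed == PERF_MEM_LVL_REM_CCE1 */
};

/* Simplified stand-in for the mem_lvl and mem_remote fields of
 * union perf_mem_data_src (the real union is a packed bitfield). */
struct sketch_data_src {
	uint32_t mem_lvl;
	bool     mem_remote;
};

enum sketch_origin {
	SKETCH_UNKNOWN,
	SKETCH_CURRENT_CORE, /* LVL_L2 or LVL_L3, remote clear */
	SKETCH_OTHER_CORE,   /* LVL_L2 or LVL_L3, remote set */
	SKETCH_OTHER_CHIP,   /* LVL_REM_CCE1 */
};

/* Classify a sample using the encoding proposed above. */
static enum sketch_origin sketch_classify(struct sketch_data_src s)
{
	/* A remote-cache level always means another chip. */
	if (s.mem_lvl & SKETCH_LVL_REM_CCE1)
		return SKETCH_OTHER_CHIP;

	/* For a plain L2/L3 level, the remote bit distinguishes another
	 * core on the same chip from the current core. */
	if (s.mem_lvl & (SKETCH_LVL_L2 | SKETCH_LVL_L3))
		return s.mem_remote ? SKETCH_OTHER_CORE : SKETCH_CURRENT_CORE;

	return SKETCH_UNKNOWN;
}
```

The point of the sketch is that no new level numbers are needed: the existing level bits plus the remote flag are enough to separate the three cases.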

I think that makes sense? Unless people think remote should be reserved
to mean on another chip, though we already have REM_CCE1 for that.

cheers