lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 09 Sep 2021 22:45:54 +1000
From:   Michael Ellerman <mpe@...erman.id.au>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Kajol Jain <kjain@...ux.ibm.com>, linuxppc-dev@...ts.ozlabs.org,
        linux-kernel@...r.kernel.org, mingo@...hat.com, acme@...nel.org,
        jolsa@...nel.org, namhyung@...nel.org,
        linux-perf-users@...r.kernel.org, ak@...ux.intel.com,
        maddy@...ux.ibm.com, atrajeev@...ux.vnet.ibm.com,
        rnsastry@...ux.ibm.com, yao.jin@...ux.intel.com, ast@...nel.org,
        daniel@...earbox.net, songliubraving@...com,
        kan.liang@...ux.intel.com, mark.rutland@....com,
        alexander.shishkin@...ux.intel.com, paulus@...ba.org
Subject: Re: [PATCH 1/3] perf: Add macros to specify onchip L2/L3 accesses

Peter Zijlstra <peterz@...radead.org> writes:
> On Wed, Sep 08, 2021 at 05:17:53PM +1000, Michael Ellerman wrote:
>> Kajol Jain <kjain@...ux.ibm.com> writes:
>
>> > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> > index f92880a15645..030b3e990ac3 100644
>> > --- a/include/uapi/linux/perf_event.h
>> > +++ b/include/uapi/linux/perf_event.h
>> > @@ -1265,7 +1265,9 @@ union perf_mem_data_src {
>> >  #define PERF_MEM_LVLNUM_L2	0x02 /* L2 */
>> >  #define PERF_MEM_LVLNUM_L3	0x03 /* L3 */
>> >  #define PERF_MEM_LVLNUM_L4	0x04 /* L4 */
>> > -/* 5-0xa available */
>> > +#define PERF_MEM_LVLNUM_OC_L2	0x05 /* On Chip L2 */
>> > +#define PERF_MEM_LVLNUM_OC_L3	0x06 /* On Chip L3 */
>> 
>> The obvious use for 5 is for "L5" and so on.
>> 
>> I'm not sure adding new levels is the best idea, because these don't fit
>> neatly into the hierarchy, they are off to the side.
>> 
>> 
>> I wonder if we should use the remote field.
>> 
>> ie. for another core's L2 we set:
>> 
>>   mem_lvl = PERF_MEM_LVL_L2
>>   mem_remote = 1
>
> This mixes APIs (see below), IIUC the correct usage would be something
> like: lvl_num=L2 remote=1

Aha, I was wondering how lvl and lvl_num were supposed to interact.

>> Which would mean "remote L2", but not remote enough to be
>> lvl = PERF_MEM_LVL_REM_CCE1.
>> 
>> It would be printed by the existing tools/perf code as "Remote L2", vs
>> "Remote cache (1 hop)", which seems OK.
>> 
>> 
>> ie. we'd be able to express:
>> 
>>   Current core's L2: LVL_L2
>>   Other core's L2:   LVL_L2 | REMOTE
>>   Other chip's L2:   LVL_REM_CCE1 | REMOTE
>> 
>> And similarly for L3.
>> 
>> I think that makes sense? Unless people think remote should be reserved
>> to mean on another chip, though we already have REM_CCE1 for that.
>
> IIRC the PERF_MEM_LVL_* namespace is somewhat depricated in favour of
> the newer composite PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_} fields. Of
> course, ABIs being what they are, we get to support both :/ But I'm not
> sure mixing them is a great idea.

OK.

> Also, clearly this could use a comment...
>
> The 'new' composite doesnt have a hops field because the hardware that
> nessecitated that change doesn't report it, but we could easily add a
> field there.
>
> Suppose we add, mem_hops:3 (would 6 hops be too small?) and the
> corresponding PERF_MEM_HOPS_{NA, 0..6}

It's really 7 if we use remote && hop = 0 to mean the first hop.

If we're wanting to use some of the hop levels to represent
intra-chip/package hops then we could possibly use them all on a really
big system.

eg. you could imagine something like:

 L2 | 		        - local L2
 L2 | REMOTE | HOPS_0	- L2 of neighbour core
 L2 | REMOTE | HOPS_1	- L2 of near core on same chip (same 1/2 of chip)
 L2 | REMOTE | HOPS_2	- L2 of far core on same chip (other 1/2 of chip)
 L2 | REMOTE | HOPS_3	- L2 of sibling chip in same package
 L2 | REMOTE | HOPS_4	- L2 on separate package 1 hop away
 L2 | REMOTE | HOPS_5	- L2 on separate package 2 hops away
 L2 | REMOTE | HOPS_6	- L2 on separate package 3 hops away


Whether it's useful to represent all those levels I'm not sure, but it's
probably good if we have the ability.

I guess I'm 50/50 on whether that's enough levels, or whether we want
another bit to allow for future growth.

> Then I suppose you can encode things like:
> 
> 	L2			- local L2
> 	L2 | REMOTE		- remote L2 at an unspecified distance (NA)
> 	L2 | REMOTE | HOPS_0	- remote L2 on the same node
> 	L2 | REMOTE | HOPS_1	- remote L2 on a node 1 removed
> 
> Would that work?

Yeah that looks good to me.

cheers

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ