Date:	Sat, 16 Feb 2013 15:14:35 +0100
From:	Stephane Eranian <eranian@...gle.com>
To:	Benjamin Herrenschmidt <benh@...nel.crashing.org>
Cc:	Sukadev Bhattiprolu <sukadev@...ux.vnet.ibm.com>,
	Ingo Molnar <mingo@...nel.org>,
	Michael Ellerman <michael@...erman.id.au>,
	Paul Mackerras <paulus@...ba.org>,
	Maynard Johnson <mpjohn@...ibm.com>,
	Anton Blanchard <anton@...ba.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	"mingo@...e.hu" <mingo@...e.hu>,
	"ak@...ux.intel.com" <ak@...ux.intel.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	Jiri Olsa <jolsa@...hat.com>,
	Namhyung Kim <namhyung.kim@....com>
Subject: Re: [PATCH v7 07/18] perf: add generic memory sampling interface

On Sat, Feb 16, 2013 at 3:45 AM, Benjamin Herrenschmidt
<benh@...nel.crashing.org> wrote:
> On Fri, 2013-02-15 at 11:46 -0800, Sukadev Bhattiprolu wrote:
>>
>> POWER could use an additional field:
>>
>>                         mem_deratmiss:1
>
> If you want to make that field more "generic" make it "lvl1_tlb_miss",
> ie, a miss in the internal "level 1" TLB which is the smallest/fastest
> TLB level in the load/store unit.
>
If you want to express an L1 TLB miss, you can already do that:

PERF_MEM_S(TLB, MISS) | PERF_MEM_S(TLB, L1)

If this is a feature you do not support, then use the NA macro.

For instance:
PERF_MEM_S(LOCK, NA)

We need to be able to differentiate "not supported" from "did not happen".
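
To make the two cases concrete, here is a minimal user-space sketch
built on the TLB and LOCK definitions from the patch quoted further
down. The u64 typedef, TLB_FIELD_MASK and the three helper functions are
hypothetical names added for illustration; they are not part of the
patch. The sketch assumes the TLB field is 7 bits wide and the LOCK
field 2 bits wide, as the flag values in this version imply.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;	/* kernel-style alias so the sketch builds in user space */

/* Values copied from the patch quoted in this thread. */
#define PERF_MEM_LOCK_NA	0x01 /* not available */
#define PERF_MEM_LOCK_LOCKED	0x02 /* locked transaction */
#define PERF_MEM_LOCK_SHIFT	24

#define PERF_MEM_TLB_NA		0x01 /* not available */
#define PERF_MEM_TLB_HIT	0x02 /* hit level */
#define PERF_MEM_TLB_MISS	0x04 /* miss level */
#define PERF_MEM_TLB_L1		0x08 /* L1 */
#define PERF_MEM_TLB_SHIFT	26

#define PERF_MEM_S(a, s) \
	(((u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)

/* Hypothetical: the TLB flags 0x01..0x40 occupy 7 bits. */
#define TLB_FIELD_MASK	0x7fULL

/* The 2-bit LOCK field must end before the TLB field starts. */
static_assert(PERF_MEM_LOCK_SHIFT + 2 <= PERF_MEM_TLB_SHIFT,
	      "LOCK and TLB fields overlap");

/* Sample value: L1 TLB miss on hardware that cannot report lock info. */
static inline u64 example_l1_tlb_miss_no_lock_info(void)
{
	return PERF_MEM_S(TLB, MISS) | PERF_MEM_S(TLB, L1) |
	       PERF_MEM_S(LOCK, NA);
}

/* Extract the raw TLB flags from a combined value. */
static inline unsigned int tlb_field(u64 data_src)
{
	return (unsigned int)((data_src >> PERF_MEM_TLB_SHIFT) & TLB_FIELD_MASK);
}

/* True only if both the MISS and the L1 flags are set. */
static inline int is_l1_tlb_miss(u64 data_src)
{
	unsigned int want = PERF_MEM_TLB_MISS | PERF_MEM_TLB_L1;

	return (tlb_field(data_src) & want) == want;
}

/* False when the NA flag says the hardware did not report lock info. */
static inline int lock_info_available(u64 data_src)
{
	return !((data_src >> PERF_MEM_LOCK_SHIFT) & PERF_MEM_LOCK_NA);
}
```

With this, a consumer can tell "LOCK is NA" (the NA bit is set) apart
from "not a locked access" (NA is clear and LOCKED is clear), which is
the distinction I am asking for.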


>> AFAICT, POWER does not currently save the mem_op, snoop or lock info
>> for the sampled instruction.  I guess we can leave them set to 0.
>
> Well, we don't have lock instructions to begin with :-) If we can read
> the IP then we can deduce the memop tho.
>
>> > > +};
>> > > +
>> > > +/* type of opcode (load/store/prefetch,code) */
>> > > +#define PERF_MEM_OP_NA             0x01 /* not available */
>> > > +#define PERF_MEM_OP_LOAD   0x02 /* load instruction */
>> > > +#define PERF_MEM_OP_STORE  0x04 /* store instruction */
>> > > +#define PERF_MEM_OP_PFETCH 0x08 /* prefetch */
>> > > +#define PERF_MEM_OP_EXEC   0x10 /* code (execution) */
>> > > +#define PERF_MEM_OP_SHIFT  0
>> > > +
>> > > +/* memory hierarchy (memory level, hit or miss) */
>> > > +#define PERF_MEM_LVL_NA            0x01  /* not available */
>> > > +#define PERF_MEM_LVL_HIT   0x02  /* hit level */
>> > > +#define PERF_MEM_LVL_MISS  0x04  /* miss level  */
>> > > +#define PERF_MEM_LVL_L1            0x08  /* L1 */
>> > > +#define PERF_MEM_LVL_LFB   0x10  /* Line Fill Buffer */
>> > > +#define PERF_MEM_LVL_L2            0x20  /* L2 hit */
>> > > +#define PERF_MEM_LVL_L3            0x40  /* L3 hit */
>> > > +#define PERF_MEM_LVL_LOC_RAM       0x80  /* Local DRAM */
>> > > +#define PERF_MEM_LVL_REM_RAM1      0x100 /* Remote DRAM (1 hop) */
>> > > +#define PERF_MEM_LVL_REM_RAM2      0x200 /* Remote DRAM (2 hops) */
>> > > +#define PERF_MEM_LVL_REM_CCE1      0x400 /* Remote Cache (1 hop) */
>> > > +#define PERF_MEM_LVL_REM_CCE2      0x800 /* Remote Cache (2 hops) */
>> > > +#define PERF_MEM_LVL_IO            0x1000 /* I/O memory */
>> > > +#define PERF_MEM_LVL_UNC   0x2000 /* Uncached memory */
>> > > +#define PERF_MEM_LVL_SHIFT 5
>>
>> POWER saves the following information to describe where the data was
>> loaded from after a Dcache or DTLB miss.
>>
>>         FROM_L2
>>         FROM_L3
>>
>>         FROM_L2.1_SHR   From another L2 or L3 on same chip, shared
>>         FROM_L2.1_MOD   From another L2 or L3 on same chip, modified
>>
>>         FROM_L3.1_SHR   From remote L2 or L3, shared
>>         FROM_L3.1_MOD   From remote L2 or L3, modified
>>
>>         FROM_RL2L3_SHR  From remote L2 or L3, shared
>>         FROM_RL2L3_MOD  From remote L2 or L3, modified
>>
>>         FROM_DL2L3_SHR  From distant L2 or L3, shared
>>         FROM_DL2L3_MOD  From distant L2 or L3, modified
>>
>> POWER uses 4 bits and a running count for its (currently) 13 possible
>> values.
>>
>> The macros in the patch use a separate bit for each level - is that to
>> allow selecting more than one level at the same time? If so, we will
>> need to reserve a few more bits to allow for Power's memory levels that
>> don't map to the above.
>>
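
As a rough illustration of how a running-count source code could be
translated into the one-bit-per-attribute layout of the patch, here is a
small sketch. The enum hyp_src values are hypothetical and do not
reflect any real POWER register encoding. Mapping "remote" to
REM_CCE1, "distant" to REM_CCE2, and shared/modified to SNOOP HIT/HITM
is an assumption made only for this example. Any code without a match
is reported as NA.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;	/* kernel-style alias so the sketch builds in user space */

/* Subset of the values from the patch quoted in this thread. */
#define PERF_MEM_LVL_NA		0x01  /* not available */
#define PERF_MEM_LVL_HIT	0x02  /* hit level */
#define PERF_MEM_LVL_L2		0x20  /* L2 */
#define PERF_MEM_LVL_L3		0x40  /* L3 */
#define PERF_MEM_LVL_REM_CCE1	0x400 /* Remote Cache (1 hop) */
#define PERF_MEM_LVL_REM_CCE2	0x800 /* Remote Cache (2 hops) */
#define PERF_MEM_LVL_SHIFT	5

#define PERF_MEM_SNOOP_NA	0x01 /* not available */
#define PERF_MEM_SNOOP_HIT	0x04 /* snoop hit */
#define PERF_MEM_SNOOP_HITM	0x10 /* snoop hit modified */
#define PERF_MEM_SNOOP_SHIFT	19

#define PERF_MEM_S(a, s) \
	(((u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)

/* The 14 LVL flag bits must end before the SNOOP field starts. */
static_assert(PERF_MEM_LVL_SHIFT + 14 <= PERF_MEM_SNOOP_SHIFT,
	      "LVL and SNOOP fields overlap");

/* HYPOTHETICAL running-count encoding; not a real hardware layout. */
enum hyp_src {
	HYP_SRC_L2 = 1,
	HYP_SRC_L3,
	HYP_SRC_RL2L3_SHR,
	HYP_SRC_RL2L3_MOD,
	HYP_SRC_DL2L3_SHR,
	HYP_SRC_DL2L3_MOD,
};

/* Translate one enumerated source code into combined flag bits. */
static inline u64 hyp_src_to_data_src(unsigned int code)
{
	switch (code) {
	case HYP_SRC_L2:
		return PERF_MEM_S(LVL, HIT) | PERF_MEM_S(LVL, L2) |
		       PERF_MEM_S(SNOOP, NA);
	case HYP_SRC_L3:
		return PERF_MEM_S(LVL, HIT) | PERF_MEM_S(LVL, L3) |
		       PERF_MEM_S(SNOOP, NA);
	case HYP_SRC_RL2L3_SHR:
		return PERF_MEM_S(LVL, HIT) | PERF_MEM_S(LVL, REM_CCE1) |
		       PERF_MEM_S(SNOOP, HIT);
	case HYP_SRC_RL2L3_MOD:
		return PERF_MEM_S(LVL, HIT) | PERF_MEM_S(LVL, REM_CCE1) |
		       PERF_MEM_S(SNOOP, HITM);
	case HYP_SRC_DL2L3_SHR:
		return PERF_MEM_S(LVL, HIT) | PERF_MEM_S(LVL, REM_CCE2) |
		       PERF_MEM_S(SNOOP, HIT);
	case HYP_SRC_DL2L3_MOD:
		return PERF_MEM_S(LVL, HIT) | PERF_MEM_S(LVL, REM_CCE2) |
		       PERF_MEM_S(SNOOP, HITM);
	default:
		/* No matching level: report "not available". */
		return PERF_MEM_S(LVL, NA) | PERF_MEM_S(SNOOP, NA);
	}
}
```

Sources such as "another L2 or L3 on the same chip" have no obvious
counterpart among the level bits listed above, so this sketch would
report them as NA unless more bits are reserved.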
>> > > +
>> > > +/* snoop mode */
>> > > +#define PERF_MEM_SNOOP_NA  0x01 /* not available */
>> > > +#define PERF_MEM_SNOOP_NONE        0x02 /* no snoop */
>> > > +#define PERF_MEM_SNOOP_HIT 0x04 /* snoop hit */
>> > > +#define PERF_MEM_SNOOP_MISS        0x08 /* snoop miss */
>> > > +#define PERF_MEM_SNOOP_HITM        0x10 /* snoop hit modified */
>> > > +#define PERF_MEM_SNOOP_SHIFT       19
>> > > +
>> > > +/* locked instruction */
>> > > +#define PERF_MEM_LOCK_NA   0x01 /* not available */
>> > > +#define PERF_MEM_LOCK_LOCKED       0x02 /* locked transaction */
>> > > +#define PERF_MEM_LOCK_SHIFT        24
>> > > +
>> > > +/* TLB access */
>> > > +#define PERF_MEM_TLB_NA            0x01 /* not available */
>> > > +#define PERF_MEM_TLB_HIT   0x02 /* hit level */
>> > > +#define PERF_MEM_TLB_MISS  0x04 /* miss level */
>> > > +#define PERF_MEM_TLB_L1            0x08 /* L1 */
>> > > +#define PERF_MEM_TLB_L2            0x10 /* L2 */
>> > > +#define PERF_MEM_TLB_WK            0x20 /* Hardware Walker*/
>> > > +#define PERF_MEM_TLB_OS            0x40 /* OS fault handler */
>> > > +#define PERF_MEM_TLB_SHIFT 26
>>
>> On POWER, like with the Dcache source above, we have 4 bits to
>> describe where the DTLB was loaded from after a dTLB miss.
>>
>> We would probably need to allow more bits for the memory level of
>> the dTLB load source.
>>
>> > > +
>> > > +#define PERF_MEM_S(a, s) \
>> > > +   (((u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
>> > > +
>> >
>> > Would be nice to get feedback from PowerPC folks to see how well
>> > this matches their memory profiling hw capabilities?
>> >
>> > I suspect there's a lot of differences, but one can always hope
>> > ...
>> >
>> > If there's some hope for unification we could at least shape it
>> > in a way that they could pick up and extend.
>>
>> Thanks for Ccing.
>>
>> While on the topic of sampled instructions, POWER saves the following
>> information (in addition to the above memory info) for sampled
>> instructions.
>>
>>         - whether the sampled instruction encountered a stall
>>         - the reasons for the stall
>>         - whether the instruction was from the hypervisor
>>         - whether there was a branch mispredict
>>         - thresholding information
>>
>> These are clubbed into an "event vector" that is saved for sampled
>> instructions. We have been meaning to find ways to present that to
>> user space. Are there plans to retrieve and present these too?
>
> Ben.
>
>