Date:	Sat, 5 Jan 2013 19:43:29 +0100
From:	Jiri Olsa <jolsa@...hat.com>
To:	Stephane Eranian <eranian@...gle.com>
Cc:	linux-kernel@...r.kernel.org, peterz@...radead.org, mingo@...e.hu,
	ak@...ux.intel.com, acme@...hat.com, namhyung.kim@....com
Subject: Re: [PATCH v4 08/18] perf/x86: add memory profiling via PEBS Load Latency

On Thu, Dec 20, 2012 at 04:41:38PM +0100, Stephane Eranian wrote:
> This patch adds support for memory profiling using the
> PEBS Load Latency facility.
> 
> Load accesses are sampled by HW and the instruction
> address, data address, load latency, data source, tlb,
> locked information can be saved in the sampling buffer
> if using the PERF_SAMPLE_COST (for latency),

Should this be PERF_SAMPLE_WEIGHT?

> PERF_SAMPLE_ADDR, PERF_SAMPLE_DSRC types.
> 
> To enable PEBS Load Latency, users have to use the
> model specific event:
> - on NHM/WSM: MEM_INST_RETIRED:LATENCY_ABOVE_THRESHOLD
> - on SNB/IVB: MEM_TRANS_RETIRED:LATENCY_ABOVE_THRESHOLD
> 
> To make things easier, this patch also exports a generic
> alias via sysfs: mem-loads. It exports the right event
> encoding based on the host CPU and can be used directly
> by the perf tool.
> 
> Loosely based on Intel's Lin Ming patch posted on LKML
> in July 2011.
> 
> Signed-off-by: Stephane Eranian <eranian@...gle.com>
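
For readers following along, here is a minimal userspace sketch of how a
tool might turn the text of a sysfs alias such as mem-loads into raw
config values. It is not code from the patch. The term names (event,
umask, ldlat), the bit positions (event in config bits 0-7, umask in
config bits 8-15, ldlat in the low 16 bits of config1), and the names
struct ldlat_event and parse_event_alias() are all assumptions made for
illustration; the real layout should be read from the PMU's sysfs format
directory.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the two raw values a tool would program. */
struct ldlat_event {
	uint64_t config;	/* assumed: event in bits 0-7, umask in bits 8-15 */
	uint64_t config1;	/* assumed: load latency threshold in bits 0-15 */
};

/*
 * Parse a comma-separated "term=value" alias string into *ev.
 * Returns 0 on success, -1 on a malformed or unknown term.
 */
static int parse_event_alias(const char *alias, struct ldlat_event *ev)
{
	char buf[128];
	char *p;

	assert(alias != NULL && ev != NULL);
	if (strlen(alias) >= sizeof(buf))
		return -1;
	strcpy(buf, alias);
	buf[strcspn(buf, "\n")] = '\0';	/* sysfs files end with a newline */

	ev->config = 0;
	ev->config1 = 0;

	p = buf;
	while (p != NULL && *p != '\0') {
		char *next = strchr(p, ',');
		char *eq;
		uint64_t val;

		if (next != NULL)
			*next++ = '\0';	/* terminate this term, step to the next */

		eq = strchr(p, '=');
		if (eq == NULL)
			return -1;
		*eq = '\0';
		val = strtoull(eq + 1, NULL, 0);	/* base 0 accepts 0x... and decimal */

		if (strcmp(p, "event") == 0)
			ev->config |= val & 0xff;
		else if (strcmp(p, "umask") == 0)
			ev->config |= (val & 0xff) << 8;
		else if (strcmp(p, "ldlat") == 0)
			ev->config1 |= val & 0xffff;
		else
			return -1;

		p = next;
	}
	return 0;
}
```

Under those assumptions, a string like "event=0xcd,umask=0x1,ldlat=3"
would give config 0x1cd and config1 3. The string itself is only an
illustrative value, not something taken from this patch.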

SNIP

> +/*
> + * Map PEBS Load Latency Data Source encodings to generic
> + * memory data source information
> + */
> +#define P(a, b) PERF_MEM_S(a, b)
> +#define OP_LH (P(OP, LOAD) | P(LVL, HIT))
> +#define SNOOP_NONE_MISS (P(SNOOP, NONE) | P(SNOOP, MISS))
> +

I checked Intel SDM 'Table 18-13. Data Source Encoding for Load Latency Record'
and it seems to differ from the table below at a few points (noted inline).
Did you use another source?

> +static const u64 pebs_data_source[] = {
> +	P(OP, LOAD) | P(LVL, MISS) | P(LVL, L3) | P(SNOOP, NA),/* 0x00:ukn L3 */
> +	OP_LH | P(LVL, L1) | P(SNOOP, NONE),	/* 0x01: L1 local */
> +	OP_LH | P(LVL, LFB)| P(SNOOP, NONE),	/* 0x02: LFB hit */
> +	OP_LH | P(LVL, L2) | P(SNOOP, NONE),	/* 0x03: L2 hit */
> +	OP_LH | P(LVL, L3) | P(SNOOP, NONE),	/* 0x04: L3 hit */
> +	OP_LH | P(LVL, L3) | P(SNOOP, MISS),	/* 0x05: L3 hit, snoop miss */
> +	OP_LH | P(LVL, L3) | P(SNOOP, HIT),	/* 0x06: L3 hit, snoop hit */

0x6:
L3 HIT. Local or Remote home requests that hit the L3 cache and was serviced by
another processor core with a cross core snoop where modified copies were found.
(HITM).


> +	OP_LH | P(LVL, L3) | P(SNOOP, HITM),	/* 0x07: L3 hit, snoop hitm */

0x7:
Reserved

> +	OP_LH | P(LVL, REM_CCE1) | P(SNOOP, HIT),  /* 0x08: L3 miss snoop hit */
> +	OP_LH | P(LVL, REM_CCE1) | P(SNOOP, HITM), /* 0x09: L3 miss snoop hitm*/

0x9:
Reserved

> +	OP_LH | P(LVL, LOC_RAM)  | P(SNOOP, HIT),  /* 0x0a: L3 miss, shared */
> +	OP_LH | P(LVL, REM_RAM1) | P(SNOOP, HIT),  /* 0x0b: L3 miss, shared */
> +	OP_LH | P(LVL, LOC_RAM)  | SNOOP_NONE_MISS,/* 0x0c: L3 miss, excl */
> +	OP_LH | P(LVL, REM_RAM1) | SNOOP_NONE_MISS,/* 0x0d: L3 miss, excl */
> +	OP_LH | P(LVL, IO) | P(SNOOP, NONE), /* 0x0e: I/O */
> +	OP_LH | P(LVL,UNC) | P(SNOOP, NONE), /* 0x0f: uncached */
> +};
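
To summarize the three differences above, here is a small standalone
sketch that compares the snoop information the patch assigns to each
encoding with my reading of the SDM table. It is not kernel code: the
flag values, the ENC_RESERVED marker, and the function name
pebs_dsrc_mismatches() are hypothetical, and the SDM side only lists the
entries I quoted above (everything else is left as "not stated").

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PERF_MEM_SNOOP_* bits; values are illustrative. */
enum {
	SNOOP_NA     = 1 << 0,
	SNOOP_NONE   = 1 << 1,
	SNOOP_HIT    = 1 << 2,
	SNOOP_MISS   = 1 << 3,
	SNOOP_HITM   = 1 << 4,
	ENC_RESERVED = 1 << 5,	/* hypothetical marker: encoding is reserved */
};

/* Snoop bits the patch's pebs_data_source[] assigns to encodings 0x0-0xf. */
static const uint32_t patch_snoop[16] = {
	SNOOP_NA,   SNOOP_NONE, SNOOP_NONE, SNOOP_NONE,
	SNOOP_NONE, SNOOP_MISS, SNOOP_HIT,  SNOOP_HITM,
	SNOOP_HIT,  SNOOP_HITM, SNOOP_HIT,  SNOOP_HIT,
	SNOOP_NONE | SNOOP_MISS, SNOOP_NONE | SNOOP_MISS,
	SNOOP_NONE, SNOOP_NONE,
};

/* SDM entries as quoted in this mail; 0 means "not stated here". */
static const uint32_t sdm_snoop[16] = {
	[0x6] = SNOOP_HITM,	/* L3 hit, cross-core snoop found modified copy */
	[0x7] = ENC_RESERVED,
	[0x9] = ENC_RESERVED,
};

/*
 * Store every encoding whose patch value differs from the quoted SDM
 * value in out[] (ascending order) and return how many were found.
 */
static size_t pebs_dsrc_mismatches(uint8_t out[16])
{
	size_t n = 0;
	unsigned int i;

	assert(out != NULL);
	for (i = 0; i < 16; i++) {
		if (sdm_snoop[i] == 0)
			continue;	/* no SDM value quoted for this encoding */
		if (sdm_snoop[i] != patch_snoop[i])
			out[n++] = (uint8_t)i;
	}
	return n;
}
```

With the tables as written, this flags encodings 0x6, 0x7 and 0x9, which
are the three entries I commented on.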

thanks,
jirka