Message-ID: <CABPqkBSNRmXb8hbXMUOLuCpJmoLh-mmo3YGdqY1_zeepCVRNUg@mail.gmail.com>
Date: Wed, 23 Jan 2013 17:54:51 +0100
From: Stephane Eranian <eranian@...gle.com>
To: Andi Kleen <ak@...ux.intel.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <peterz@...radead.org>,
"mingo@...e.hu" <mingo@...e.hu>,
Arnaldo Carvalho de Melo <acme@...hat.com>,
Jiri Olsa <jolsa@...hat.com>,
Namhyung Kim <namhyung.kim@....com>
Subject: Re: [PATCH v6 07/18] perf: add generic memory sampling interface
On Sat, Jan 19, 2013 at 12:06 AM, Andi Kleen <ak@...ux.intel.com> wrote:
>> extern void perf_output_sample(struct perf_output_handle *handle,
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index 7e24641..8283218 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -133,9 +133,9 @@ enum perf_event_sample_format {
>> PERF_SAMPLE_REGS_USER = 1U << 12,
>> PERF_SAMPLE_STACK_USER = 1U << 13,
>> PERF_SAMPLE_WEIGHT = 1U << 14,
>> + PERF_SAMPLE_DSRC = 1U << 15,
>
> This conflicts with similar extensions in the Haswell patchkit,
> but that can be worked out by just moving some numbers (and making
> sure the input/output calls are still in the right place)
>
Yes, it all depends on which patch goes in first. No big deal.
>
>> +union perf_mem_dsrc {
>> + __u64 val;
>> + struct {
>> + __u64 mem_op:5, /* type of opcode */
>> + mem_lvl:14, /* memory hierarchy level */
>> + mem_snoop:5, /* snoop mode */
>> + mem_lock:2, /* lock instr */
>> + mem_dtlb:7, /* tlb access */
>> + mem_rsvd:31;
>> + };
>> +};
>> +
>> +/* type of opcode (load/store/prefetch,code) */
>> +#define PERF_MEM_OP_NA 0x01 /* not available */
>> +#define PERF_MEM_OP_LOAD 0x02 /* load instruction */
>> +#define PERF_MEM_OP_STORE 0x04 /* store instruction */
>> +#define PERF_MEM_OP_PFETCH 0x08 /* prefetch */
>> +#define PERF_MEM_OP_EXEC 0x10 /* code (execution) */
>> +#define PERF_MEM_OP_SHIFT 0
>
> Do we really need the shift? it's implicit in the bitfield right?
>
The bitfield is provided as a reference for user code. The kernel
code does not use it: it works on a plain u64 instead, so it
needs the shift. The shifts are used to build the static
pebs_data_source[] table.
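
To make the relationship concrete, here is a minimal user-space sketch, not the kernel code. The union copies the layout quoted above with stdint types. MEM_LVL_SHIFT is not shown in the quoted hunk; the value 5 is an assumption derived from the width of mem_op. encode_dsrc() is a hypothetical helper. Bitfield allocation order is compiler-dependent, so the bitfield view only matches the shifts on ABIs that allocate fields from the least significant bit (GCC on little-endian x86, for instance).

```c
#include <assert.h>
#include <stdint.h>

/* Layout copied from the quoted patch, using user-space types. */
union mem_dsrc {
	uint64_t val;
	struct {
		uint64_t mem_op:5,	/* type of opcode */
			 mem_lvl:14,	/* memory hierarchy level */
			 mem_snoop:5,	/* snoop mode */
			 mem_lock:2,	/* lock instr */
			 mem_dtlb:7,	/* tlb access */
			 mem_rsvd:31;
	};
};

#define MEM_OP_LOAD	0x02
#define MEM_OP_SHIFT	0
#define MEM_LVL_HIT	0x02
#define MEM_LVL_L1	0x08
#define MEM_LVL_SHIFT	5	/* assumption: follows the 5-bit mem_op field */

/* Hypothetical helper: build the plain u64 with shifts only. */
static uint64_t encode_dsrc(uint64_t op, uint64_t lvl)
{
	return (op << MEM_OP_SHIFT) | (lvl << MEM_LVL_SHIFT);
}

/* Sanity check that the shift view and the bitfield view agree. */
static void check_views_agree(void)
{
	union mem_dsrc d;

	d.val = encode_dsrc(MEM_OP_LOAD, MEM_LVL_HIT | MEM_LVL_L1);
	assert(d.mem_op == MEM_OP_LOAD);
	assert(d.mem_lvl == (MEM_LVL_HIT | MEM_LVL_L1));
}
```

User code can read the fields by name, while code that only ever handles the u64 needs the explicit shift constants.
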
>> +/* memory hierarchy (memory level, hit or miss) */
>> +#define PERF_MEM_LVL_NA 0x01 /* not available */
>> +#define PERF_MEM_LVL_HIT 0x02 /* hit level */
>> +#define PERF_MEM_LVL_MISS 0x04 /* miss level */
>> +#define PERF_MEM_LVL_L1 0x08 /* L1 */
>> +#define PERF_MEM_LVL_LFB 0x10 /* Line Fill Buffer */
>> +#define PERF_MEM_LVL_L2 0x20 /* L2 hit */
>> +#define PERF_MEM_LVL_L3 0x40 /* L3 hit */
>> +#define PERF_MEM_LVL_LOC_RAM 0x80 /* Local DRAM */
>> +#define PERF_MEM_LVL_REM_RAM1 0x100 /* Remote DRAM (1 hop) */
>> +#define PERF_MEM_LVL_REM_RAM2 0x200 /* Remote DRAM (2 hops) */
>> +#define PERF_MEM_LVL_REM_CCE1 0x400 /* Remote Cache (1 hop) */
>> +#define PERF_MEM_LVL_REM_CCE2 0x800 /* Remote Cache (2 hops) */
>> +#define PERF_MEM_LVL_IO 0x1000 /* I/O memory */
>> +#define PERF_MEM_LVL_UNC 0x2000 /* Uncached memory */
>
> I would leave some free bits here, obviously this doesn't cover all
> that may be possible in system architecture. Also why is this a bit mask,
> you can only hit one level right? So perhaps a number.
>
Yeah, I have been going back and forth on how best to define this to leave
some room for extensions. For now, we have 31 bits left in the MSB
part of the u64. We could either leave them there or commandeer some
for the memory hierarchy field.
This can be a bitmask on architectures where the HW cannot determine
for sure where the line came from. It may provide a best-effort answer,
such as: missed L2 or L3.
>> +/* TLB access */
>> +#define PERF_MEM_TLB_NA 0x01 /* not available */
>> +#define PERF_MEM_TLB_HIT 0x02 /* hit level */
>> +#define PERF_MEM_TLB_MISS 0x04 /* miss level */
>> +#define PERF_MEM_TLB_L1 0x08 /* L1 */
>> +#define PERF_MEM_TLB_L2 0x10 /* L2 */
>> +#define PERF_MEM_TLB_WK 0x20 /* Hardware Walker*/
>> +#define PERF_MEM_TLB_OS 0x40 /* OS fault handler */
>
>
The current x86 PEBS-LL is an example of HW that cannot disambiguate
where the TLB access was actually satisfied. It can, for instance, report
that the access did not miss the 2nd level TLB, which means: hit L1 TLB or
L2 TLB. That's why you need a bitmask.
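
As a rough illustration of that case, the sketch below encodes "did not miss the 2nd level TLB" as a hit with both level bits set. The constant values are the ones from the quoted patch with the PERF_ prefix dropped; tlb_no_l2_miss() and tlb_level_is_ambiguous() are hypothetical helpers, not part of the patch.

```c
#include <stdint.h>

#define MEM_TLB_NA	0x01	/* not available */
#define MEM_TLB_HIT	0x02	/* hit level */
#define MEM_TLB_MISS	0x04	/* miss level */
#define MEM_TLB_L1	0x08	/* L1 */
#define MEM_TLB_L2	0x10	/* L2 */

/* HW only knows the access hit somewhere at or above the L2 TLB. */
static uint64_t tlb_no_l2_miss(void)
{
	return MEM_TLB_HIT | MEM_TLB_L1 | MEM_TLB_L2;
}

/* True when more than one TLB level bit is set, i.e. HW could not tell. */
static int tlb_level_is_ambiguous(uint64_t tlb)
{
	uint64_t levels = tlb & (MEM_TLB_L1 | MEM_TLB_L2);

	return (levels & (levels - 1)) != 0;
}
```

A plain level number could not express this: the consumer would have to pick L1 or L2 arbitrarily.
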
> Same
>
>
>> +#define PERF_MEM_TLB_SHIFT 26
>> +
>> +#define PERF_MEM_S(a, s) \
>> + (((u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
>
> Is that used by anything?
>
>
Yes, it is used to populate the pebs_data_source[] table in perf_event_intel_ds.c.
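
For readers without the rest of the series at hand, this is roughly how the macro is meant to be used. The macro and the constants are taken from the quoted hunks; the table below is a made-up two-entry example (example_source[] is a hypothetical name) and does not reproduce the real pebs_data_source[] contents.

```c
#include <stdint.h>

typedef uint64_t u64;

/* Constants as they appear in the quoted patch. */
#define PERF_MEM_OP_LOAD	0x02
#define PERF_MEM_OP_SHIFT	0
#define PERF_MEM_TLB_HIT	0x02
#define PERF_MEM_TLB_L1		0x08
#define PERF_MEM_TLB_SHIFT	26

/* Pastes field name and value name together, then shifts into place. */
#define PERF_MEM_S(a, s) \
	(((u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)

/* Illustrative table only: entries are assumptions, not the real data. */
static const u64 example_source[] = {
	PERF_MEM_S(OP, LOAD),
	PERF_MEM_S(OP, LOAD) | PERF_MEM_S(TLB, HIT) | PERF_MEM_S(TLB, L1),
};
```

Each entry is a compile-time constant, so the table can be static and indexed directly by whatever encoding the HW reports.
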