Message-ID: <20140219152354.GW12219@tassilo.jf.intel.com>
Date: Wed, 19 Feb 2014 07:23:54 -0800
From: Andi Kleen <ak@...ux.intel.com>
To: Dave Hansen <dave.hansen@...ux.intel.com>
Cc: LKML <linux-kernel@...r.kernel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>
Subject: Re: x86 perf's dTLB-load-misses broken on IvyBridge?
On Tue, Feb 18, 2014 at 03:11:59PM -0800, Dave Hansen wrote:
> I noticed that perf's dTLB-load-misses event isn't working on my
> Ivybridge system:
>
> > Performance counter stats for 'system wide':
> >
> > 0 dTLB-load-misses [100.00%]
> > 48,570 dTLB-store-misses [100.00%]
> > 202,573 iTLB-loads [100.00%]
> > 271,546 iTLB-load-misses # 134.05% of all iTLB cache hits
>
> But it works on a SandyBridge system that I have.
>
> arch/x86/kernel/cpu/perf_event_intel.c seems to use the same tables for
> SandyBridge and IvyBridge, so they both use the
> 'MEM_UOP_RETIRED.ALL_LOADS' event:
>
> > [ C(DTLB) ] = {
> > [ C(OP_READ) ] = {
> > [ C(RESULT_ACCESS) ] = 0x81d0, /* MEM_UOP_RETIRED.ALL_LOADS */
> > [ C(RESULT_MISS) ] = 0x0108, /* DTLB_LOAD_MISSES.CAUSES_A_WALK */
> > },
>
> But that event looks to be unsupported on this CPU:
I thought you wanted the miss event?
That would be the second entry.
ALL_LOADS is the access event. It works for me, both raw and perf cooked
(not sure why the two numbers are different, though)
% perf stat -e dTLB-loads,r81d0 -a sleep 1
Performance counter stats for 'system wide':
12,685,064 dTLB-loads [100.00%]
13,277,367 r81d0
1.001420860 seconds time elapsed
The miss event counts too:
perf stat -e dTLB-load-misses,dTLB-load -a sleep 1
Performance counter stats for 'system wide':
19,504 dTLB-load-misses # 0.30% of all dTLB cache hits [100.00%]
6,471,308 dTLB-load
1.001894328 seconds time elapsed
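As a sanity check on the percentage perf prints next to dTLB-load-misses, here is a minimal sketch that assumes the figure is simply misses divided by the load count from the same run. The helper name miss_ratio_percent is made up for illustration and is not part of perf.

```python
def miss_ratio_percent(misses: int, accesses: int) -> float:
    """Return misses as a percentage of accesses."""
    if accesses == 0:
        raise ValueError("access count is zero, ratio is undefined")
    return 100.0 * misses / accesses


# Counts taken from the perf stat output above.
ratio = miss_ratio_percent(19_504, 6_471_308)
print(f"{ratio:.2f}%")  # prints "0.30%"
```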
Same raw:
perf stat -e r0108 -a sleep 1
Performance counter stats for 'system wide':
82,285 r0108
1.001353060 seconds time elapsed
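For reference, a small sketch of how the raw codes used here relate to the cpu/event=...,umask=.../ syntax. It assumes the usual Intel layout where the low byte of the raw value is the event select and the next byte is the unit mask. The function names are hypothetical, not perf APIs.

```python
def decode_raw_event(raw: str) -> tuple[int, int]:
    """Split a perf raw event string such as 'r81d0' into (event, umask)."""
    digits = raw[1:] if raw.startswith("r") else raw
    value = int(digits, 16)
    event = value & 0xFF          # bits 0-7: event select
    umask = (value >> 8) & 0xFF   # bits 8-15: unit mask
    return event, umask


def to_cpu_syntax(raw: str) -> str:
    """Render a raw event string in the cpu/event=...,umask=.../ form."""
    event, umask = decode_raw_event(raw)
    return f"cpu/event=0x{event:02x},umask=0x{umask:02x}/"


for raw in ("r81d0", "r0108"):
    print(raw, "->", to_cpu_syntax(raw))
```

So r81d0 is event 0xd0 with umask 0x81 (the access event) and r0108 is event 0x08 with umask 0x01 (the miss event), matching the table entries quoted above.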
> > perf stat -a -e cpu/event=0xd0,umask=0x81,name=mem_uops_retired_all_loads/ sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> > <not supported> mem_uops_retired_all_loads
> > 50,204,763 mem_uops_retired_all_loads_ps
>
> But there's a "_ps" version which uses PEBS which does work?
Both work for me on an IvyBridge.
> Should we swap perf_event_intel.c over to use the PEBS version so that
> it works everywhere?
Shouldn't be needed.
PEBS for counting normally doesn't make much sense.
-Andi
--
ak@...ux.intel.com -- Speaking for myself only