Message-ID: <1309931621.18875.130.camel@minggr.sh.intel.com>
Date: Wed, 06 Jul 2011 13:53:41 +0800
From: Lin Ming <ming.m.lin@...el.com>
To: Peter Zijlstra <a.p.zijlstra@...llo.nl>
Cc: Ingo Molnar <mingo@...e.hu>, Andi Kleen <andi@...stfloor.org>,
Stephane Eranian <eranian@...gle.com>,
Arnaldo Carvalho de Melo <acme@...stprotocols.net>,
linux-kernel <linux-kernel@...r.kernel.org>,
Robert Richter <robert.richter@....com>
Subject: Re: [PATCH 1/4] perf: Add memory load/store events generic code
On Tue, 2011-07-05 at 22:17 +0800, Peter Zijlstra wrote:
> On Tue, 2011-07-05 at 19:54 +0800, Lin Ming wrote:
> > On Mon, 2011-07-04 at 19:16 +0800, Peter Zijlstra wrote:
> > > On Mon, 2011-07-04 at 08:02 +0000, Lin Ming wrote:
> > > > +#define MEM_STORE_DCU_HIT (1ULL << 0)
> > >
> > > I'm pretty sure that's not Dublin City University, but what is it?
> > > Data-Cache-Unit? what does that mean, L1/L2 or also L3?
> > >
> > > > +#define MEM_STORE_STLB_HIT (1ULL << 1)
> > >
> > > What's an sTLB? I know iTLB and dTLB's but sTLBs I've not heard of yet.
> > >
> > > > +#define MEM_STORE_LOCKED_ACCESS (1ULL << 2)
> > >
> > > Presumably that's about LOCK'ed ops?
> > >
> > > So now you're just tacking bits on the end without even attempting to
> > > generalize/unify things, not charmed at all.
> >
> > Any idea on the more useful store bits encoding?
>
> For two of them, sure:
>
> {load, store} x {atomic} x
> {hasSRC} x {l1, l2, l3, ram, unknown, io, uncached, reserved} x
> {hasLRS} x {local, remote, snoop} x
> {hasMESI} x {MESI}
>
> that would make MEM_STORE_DCU_HIT: store-l1 and MEM_STORE_LOCKED:
> store-atomic.
>
> Now this is needed for load-latency as well, since SNB extended the src
> information with the same STLB/LOCK bits.
>
> The SDM is somewhat inconsistent on what an STLB_MISS means:
>
> Table 30-22 says: 0 - did not miss STLB (hit the DTLB/STLB), 1 - missed
> the STLB.
>
> Table 30-23 says: "the store missed the STLB if set, otherwise the store
> hit the STLB", which simply cannot be true.
>
> So I'm sticking with 30-22.
>
> Now the above doesn't yet deal with TLBs nor can it map the IBS data
> source bits because afaict that can report a u-op as both a store and a
> load, but does not mention if a data-cache miss means L1 or L1/L2,
> Robert?
>
> One way to sort all that is not use enumerated spaces like above but
> simply explode the whole thing like: load x store x atomic x l1 x l2
> x ... that would of course give rise to a load of impossible
> combinations but would do away with the hasFOO bits.
>
> If the AMD data-cache means L1/L2 it can simply set both bits, same with
> the Intel STLB miss, it can set TLB1/TLB2 bits (AMD does split those
> nicely).
>
> With all those bits exploded we can also express the inverse of
> MEM_STORE_DCU_HIT as: store-l2-l3-dram, we simply set ~l1 for the
> appropriate submask (which should arguably include IO/uncached/unknown
> as well).

Do you mean using the "impossible combinations" to express the inverse?

MEM_STORE_DCU_MISS as: store-l2-l3-dram
MEM_STORE_STLB_MISS as: store-itlb-dtlb

How about the code below?

#define PERF_MEM_LOAD (1ULL << 0)
#define PERF_MEM_STORE (1ULL << 1)
#define PERF_MEM_ATOMIC (1ULL << 2)
#define PERF_MEM_L1 (1ULL << 3)
#define PERF_MEM_L2 (1ULL << 4)
#define PERF_MEM_L3 (1ULL << 5)
#define PERF_MEM_RAM (1ULL << 6)
#define PERF_MEM_UNKNOWN (1ULL << 7)
#define PERF_MEM_IO (1ULL << 8)
#define PERF_MEM_UNCACHED (1ULL << 9)
#define PERF_MEM_RESERVED (1ULL << 10)
#define PERF_MEM_LOCAL (1ULL << 11)
#define PERF_MEM_REMOTE (1ULL << 12)
#define PERF_MEM_SNOOP (1ULL << 13)
#define PERF_MEM_MODIFIED (1ULL << 14)
#define PERF_MEM_EXCLUSIVE (1ULL << 15)
#define PERF_MEM_SHARED (1ULL << 16)
#define PERF_MEM_INVALID (1ULL << 17)
#define PERF_MEM_ITLB (1ULL << 18)
#define PERF_MEM_DTLB (1ULL << 19)
#define PERF_MEM_STLB (1ULL << 20)

#define PERF_MEM_STORE_L1D_HIT \
	(PERF_MEM_STORE | PERF_MEM_L1)
#define PERF_MEM_STORE_L1D_MISS \
	(PERF_MEM_STORE | PERF_MEM_L2 | PERF_MEM_L3 | PERF_MEM_RAM)
#define PERF_MEM_STORE_STLB_HIT \
	(PERF_MEM_STORE | PERF_MEM_STLB)
#define PERF_MEM_STORE_STLB_MISS \
	(PERF_MEM_STORE | PERF_MEM_ITLB | PERF_MEM_DTLB)
#define PERF_MEM_STORE_ATOMIC \
	(PERF_MEM_STORE | PERF_MEM_ATOMIC)
#define PERF_MEM_LOAD_STLB_HIT \
	(PERF_MEM_LOAD | PERF_MEM_STLB)
#define PERF_MEM_LOAD_STLB_MISS \
	(PERF_MEM_LOAD | PERF_MEM_ITLB | PERF_MEM_DTLB)
#define PERF_MEM_LOAD_ATOMIC \
	(PERF_MEM_LOAD | PERF_MEM_ATOMIC)
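
For illustration only, here is a minimal sketch of how the "~l1 within the
appropriate submask" idea and a sample filter might look with the bits
proposed above. The submasks MEM_OP_MASK and MEM_LVL_MASK and the helpers
mem_store_l1_miss(), mem_sample_matches() and mem_encoding_selfcheck() are
hypothetical names, not part of the patch, and the level submask here covers
only L1/L2/L3/RAM (it leaves out IO/uncached/unknown, which the quoted mail
suggests it should arguably include).

```c
#include <assert.h>
#include <stdint.h>

/* Bit values copied from the proposal above (subset). */
#define PERF_MEM_LOAD	(1ULL << 0)
#define PERF_MEM_STORE	(1ULL << 1)
#define PERF_MEM_L1	(1ULL << 3)
#define PERF_MEM_L2	(1ULL << 4)
#define PERF_MEM_L3	(1ULL << 5)
#define PERF_MEM_RAM	(1ULL << 6)

#define PERF_MEM_STORE_L1D_HIT \
	(PERF_MEM_STORE | PERF_MEM_L1)
#define PERF_MEM_STORE_L1D_MISS \
	(PERF_MEM_STORE | PERF_MEM_L2 | PERF_MEM_L3 | PERF_MEM_RAM)

/* Hypothetical submasks: which bits describe the op and the level. */
#define MEM_OP_MASK	(PERF_MEM_LOAD | PERF_MEM_STORE)
#define MEM_LVL_MASK \
	(PERF_MEM_L1 | PERF_MEM_L2 | PERF_MEM_L3 | PERF_MEM_RAM)

/* "Store that missed L1": clear the L1 bit inside the level submask. */
static uint64_t mem_store_l1_miss(void)
{
	return PERF_MEM_STORE | (~PERF_MEM_L1 & MEM_LVL_MASK);
}

/*
 * One possible matching rule (an assumption): a sample matches a filter
 * when they share at least one op bit and at least one level bit.
 */
static int mem_sample_matches(uint64_t sample, uint64_t filter)
{
	return (sample & filter & MEM_OP_MASK) &&
	       (sample & filter & MEM_LVL_MASK);
}

/* Sanity checks for the sketch. */
void mem_encoding_selfcheck(void)
{
	/* The inverse computed from the submask equals the explicit macro. */
	assert(mem_store_l1_miss() == PERF_MEM_STORE_L1D_MISS);

	/* A store served from L2 counts as an L1D miss ... */
	assert(mem_sample_matches(PERF_MEM_STORE | PERF_MEM_L2,
				  PERF_MEM_STORE_L1D_MISS));
	/* ... while a store that hit L1 does not. */
	assert(!mem_sample_matches(PERF_MEM_STORE | PERF_MEM_L1,
				   PERF_MEM_STORE_L1D_MISS));
}
```

With this rule a filter may set several level bits at once, so the
"impossible combination" store-l2-l3-dram acts as "any of these levels"
rather than as a description of a single access.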