[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20240321060016.1464787-6-irogers@google.com>
Date: Wed, 20 Mar 2024 23:00:09 -0700
From: Ian Rogers <irogers@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>,
Kan Liang <kan.liang@...ux.intel.com>, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, Perry Taylor <perry.taylor@...el.com>,
Samantha Alt <samantha.alt@...el.com>, Caleb Biggers <caleb.biggers@...el.com>,
Weilin Wang <weilin.wang@...el.com>, Edward Baker <edward.baker@...el.com>
Subject: [PATCH v1 05/12] perf vendor events intel: Update lunarlake to 1.01
Update events from 1.00 to 1.01 as released in:
https://github.com/intel/perfmon/commit/56ab8d837ac566d51a4d8748b6b4b817a22c9b84
Various encoding and description updates. Adds the events
CPU_CLK_UNHALTED.CORE, CPU_CLK_UNHALTED.CORE_P,
CPU_CLK_UNHALTED.REF_TSC_P, CPU_CLK_UNHALTED.THREAD,
MISC_RETIRED.LBR_INSERTS, TOPDOWN_BAD_SPECULATION.ALL_P,
TOPDOWN_BE_BOUND.ALL_P, TOPDOWN_FE_BOUND.ALL_P,
TOPDOWN_RETIRING.ALL_P.
Signed-off-by: Ian Rogers <irogers@...gle.com>
---
.../pmu-events/arch/x86/lunarlake/cache.json | 24 ++--
.../arch/x86/lunarlake/frontend.json | 2 +-
.../pmu-events/arch/x86/lunarlake/memory.json | 4 +-
.../pmu-events/arch/x86/lunarlake/other.json | 4 +-
.../arch/x86/lunarlake/pipeline.json | 109 +++++++++++++++---
tools/perf/pmu-events/arch/x86/mapfile.csv | 2 +-
6 files changed, 113 insertions(+), 32 deletions(-)
diff --git a/tools/perf/pmu-events/arch/x86/lunarlake/cache.json b/tools/perf/pmu-events/arch/x86/lunarlake/cache.json
index 1823149067b5..fb48be357c4e 100644
--- a/tools/perf/pmu-events/arch/x86/lunarlake/cache.json
+++ b/tools/perf/pmu-events/arch/x86/lunarlake/cache.json
@@ -31,7 +31,7 @@
"EventCode": "0x2e",
"EventName": "LONGEST_LAT_CACHE.REFERENCE",
"PublicDescription": "Counts the number of cacheable memory requests that access the Last Level Cache (LLC). Requests include demand loads, reads for ownership (RFO), instruction fetches and L1 HW prefetches. If the platform has an L3 cache, the LLC is the L3 cache, otherwise it is the L2 cache. Counts on a per core basis.",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x4f",
"Unit": "cpu_atom"
},
@@ -94,7 +94,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x400",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -106,7 +106,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x80",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -118,7 +118,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x10",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -130,7 +130,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x800",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -142,7 +142,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x100",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -154,7 +154,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x20",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -166,7 +166,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x4",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -178,7 +178,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x200",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -190,7 +190,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x40",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -202,7 +202,7 @@
"MSRIndex": "0x3F6",
"MSRValue": "0x8",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x5",
"Unit": "cpu_atom"
},
@@ -212,7 +212,7 @@
"EventCode": "0xd0",
"EventName": "MEM_UOPS_RETIRED.STORE_LATENCY",
"PEBS": "2",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "200003",
"UMask": "0x6",
"Unit": "cpu_atom"
}
diff --git a/tools/perf/pmu-events/arch/x86/lunarlake/frontend.json b/tools/perf/pmu-events/arch/x86/lunarlake/frontend.json
index 5e4ef81b43d6..3a24934e8d6e 100644
--- a/tools/perf/pmu-events/arch/x86/lunarlake/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/lunarlake/frontend.json
@@ -19,7 +19,7 @@
"BriefDescription": "This event counts a subset of the Topdown Slots event that were no operation was delivered to the back-end pipeline due to instruction fetch limitations when the back-end could have accepted more operations. Common examples include instruction cache misses or x86 instruction decode limitations.",
"EventCode": "0x9c",
"EventName": "IDQ_BUBBLES.CORE",
- "PublicDescription": "This event counts a subset of the Topdown Slots event that were no operation was delivered to the back-end pipeline due to instruction fetch limitations when the back-end could have accepted more operations. Common examples include instruction cache misses or x86 instruction decode limitations.\nSoftware can use this event as the numerator for the Frontend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
+ "PublicDescription": "This event counts a subset of the Topdown Slots event that were no operation was delivered to the back-end pipeline due to instruction fetch limitations when the back-end could have accepted more operations. Common examples include instruction cache misses or x86 instruction decode limitations. Software can use this event as the numerator for the Frontend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
"SampleAfterValue": "1000003",
"UMask": "0x1",
"Unit": "cpu_core"
diff --git a/tools/perf/pmu-events/arch/x86/lunarlake/memory.json b/tools/perf/pmu-events/arch/x86/lunarlake/memory.json
index 51d70ba00bd4..9c188d80b7b9 100644
--- a/tools/perf/pmu-events/arch/x86/lunarlake/memory.json
+++ b/tools/perf/pmu-events/arch/x86/lunarlake/memory.json
@@ -155,7 +155,7 @@
"EventCode": "0x2A,0x2B",
"EventName": "OCR.DEMAND_DATA_RD.L3_MISS",
"MSRIndex": "0x1a6,0x1a7",
- "MSRValue": "0x3FBFC00001",
+ "MSRValue": "0xFE7F8000001",
"SampleAfterValue": "100003",
"UMask": "0x1",
"Unit": "cpu_core"
@@ -175,7 +175,7 @@
"EventCode": "0x2A,0x2B",
"EventName": "OCR.DEMAND_RFO.L3_MISS",
"MSRIndex": "0x1a6,0x1a7",
- "MSRValue": "0x3FBFC00002",
+ "MSRValue": "0xFE7F8000002",
"SampleAfterValue": "100003",
"UMask": "0x1",
"Unit": "cpu_core"
diff --git a/tools/perf/pmu-events/arch/x86/lunarlake/other.json b/tools/perf/pmu-events/arch/x86/lunarlake/other.json
index 69adaed5686d..377f717db6cc 100644
--- a/tools/perf/pmu-events/arch/x86/lunarlake/other.json
+++ b/tools/perf/pmu-events/arch/x86/lunarlake/other.json
@@ -24,7 +24,7 @@
"EventCode": "0xB7",
"EventName": "OCR.DEMAND_DATA_RD.DRAM",
"MSRIndex": "0x1a6,0x1a7",
- "MSRValue": "0x184000001",
+ "MSRValue": "0x1FBC000001",
"SampleAfterValue": "100003",
"UMask": "0x1",
"Unit": "cpu_atom"
@@ -34,7 +34,7 @@
"EventCode": "0x2A,0x2B",
"EventName": "OCR.DEMAND_DATA_RD.DRAM",
"MSRIndex": "0x1a6,0x1a7",
- "MSRValue": "0x184000001",
+ "MSRValue": "0x1E780000001",
"SampleAfterValue": "100003",
"UMask": "0x1",
"Unit": "cpu_core"
diff --git a/tools/perf/pmu-events/arch/x86/lunarlake/pipeline.json b/tools/perf/pmu-events/arch/x86/lunarlake/pipeline.json
index 2bde664fdc0f..2c9f85ec8c4a 100644
--- a/tools/perf/pmu-events/arch/x86/lunarlake/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/lunarlake/pipeline.json
@@ -38,10 +38,18 @@
{
"BriefDescription": "Fixed Counter: Counts the number of unhalted core clock cycles",
"EventName": "CPU_CLK_UNHALTED.CORE",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "2000003",
"UMask": "0x2",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Core cycles when the core is not in a halt state.",
+ "EventName": "CPU_CLK_UNHALTED.CORE",
+ "PublicDescription": "Counts the number of core cycles while the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the programmable counters available for other events.",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x2",
+ "Unit": "cpu_core"
+ },
{
"BriefDescription": "Counts the number of unhalted core clock cycles [This event is alias to CPU_CLK_UNHALTED.THREAD_P]",
"EventCode": "0x3c",
@@ -49,10 +57,18 @@
"SampleAfterValue": "2000003",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Thread cycles when thread is not in halt state [This event is alias to CPU_CLK_UNHALTED.THREAD_P]",
+ "EventCode": "0x3c",
+ "EventName": "CPU_CLK_UNHALTED.CORE_P",
+ "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time. [This event is alias to CPU_CLK_UNHALTED.THREAD_P]",
+ "SampleAfterValue": "2000003",
+ "Unit": "cpu_core"
+ },
{
"BriefDescription": "Fixed Counter: Counts the number of unhalted reference clock cycles",
"EventName": "CPU_CLK_UNHALTED.REF_TSC",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "2000003",
"UMask": "0x3",
"Unit": "cpu_atom"
},
@@ -64,6 +80,15 @@
"UMask": "0x3",
"Unit": "cpu_core"
},
+ {
+ "BriefDescription": "Counts the number of unhalted reference clock cycles",
+ "EventCode": "0x3c",
+ "EventName": "CPU_CLK_UNHALTED.REF_TSC_P",
+ "PublicDescription": "Counts the number of reference cycles that the core is not in a halt state. The core enters the halt state when it is running the HLT instruction. This event is not affected by core frequency changes and increments at a fixed frequency that is also used for the Time Stamp Counter (TSC). This event uses a programmable general purpose performance counter.",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Reference cycles when the core is not in halt state.",
"EventCode": "0x3c",
@@ -74,9 +99,16 @@
"Unit": "cpu_core"
},
{
- "BriefDescription": "Core cycles when the thread is not in halt state",
+ "BriefDescription": "Fixed Counter: Counts the number of unhalted core clock cycles",
+ "EventName": "CPU_CLK_UNHALTED.THREAD",
+ "SampleAfterValue": "2000003",
+ "UMask": "0x2",
+ "Unit": "cpu_atom"
+ },
+ {
+ "BriefDescription": "Core cycles when the thread is not in a halt state.",
"EventName": "CPU_CLK_UNHALTED.THREAD",
- "PublicDescription": "Counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the eight programmable counters available for other events.",
+ "PublicDescription": "Counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the programmable counters available for other events.",
"SampleAfterValue": "2000003",
"UMask": "0x2",
"Unit": "cpu_core"
@@ -89,10 +121,10 @@
"Unit": "cpu_atom"
},
{
- "BriefDescription": "Thread cycles when thread is not in halt state",
+ "BriefDescription": "Thread cycles when thread is not in halt state [This event is alias to CPU_CLK_UNHALTED.CORE_P]",
"EventCode": "0x3c",
"EventName": "CPU_CLK_UNHALTED.THREAD_P",
- "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time.",
+ "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time. [This event is alias to CPU_CLK_UNHALTED.CORE_P]",
"SampleAfterValue": "2000003",
"Unit": "cpu_core"
},
@@ -100,7 +132,7 @@
"BriefDescription": "Fixed Counter: Counts the number of instructions retired",
"EventName": "INST_RETIRED.ANY",
"PEBS": "1",
- "SampleAfterValue": "1000003",
+ "SampleAfterValue": "2000003",
"UMask": "0x1",
"Unit": "cpu_atom"
},
@@ -148,11 +180,29 @@
"UMask": "0x82",
"Unit": "cpu_core"
},
+ {
+ "BriefDescription": "Counts the number of LBR entries recorded. Requires LBRs to be enabled in IA32_LBR_CTL.",
+ "EventCode": "0xe4",
+ "EventName": "MISC_RETIRED.LBR_INSERTS",
+ "PEBS": "1",
+ "SampleAfterValue": "1000003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
+ {
+ "BriefDescription": "LBR record is inserted",
+ "EventCode": "0xe4",
+ "EventName": "MISC_RETIRED.LBR_INSERTS",
+ "PEBS": "1",
+ "SampleAfterValue": "1000003",
+ "UMask": "0x1",
+ "Unit": "cpu_core"
+ },
{
"BriefDescription": "This event counts a subset of the Topdown Slots event that were not consumed by the back-end pipeline due to lack of back-end resources, as a result of memory subsystem delays, execution units limitations, or other conditions.",
"EventCode": "0xa4",
"EventName": "TOPDOWN.BACKEND_BOUND_SLOTS",
- "PublicDescription": "This event counts a subset of the Topdown Slots event that were not consumed by the back-end pipeline due to lack of back-end resources, as a result of memory subsystem delays, execution units limitations, or other conditions.\nSoftware can use this event as the numerator for the Backend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
+ "PublicDescription": "This event counts a subset of the Topdown Slots event that were not consumed by the back-end pipeline due to lack of back-end resources, as a result of memory subsystem delays, execution units limitations, or other conditions. Software can use this event as the numerator for the Backend Bound metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
"SampleAfterValue": "10000003",
"UMask": "0x2",
"Unit": "cpu_core"
@@ -175,21 +225,35 @@
"Unit": "cpu_core"
},
{
- "BriefDescription": "Fixed Counter: Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear.",
+ "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. [This event is alias to TOPDOWN_BAD_SPECULATIONALL_P]",
+ "EventCode": "0x73",
"EventName": "TOPDOWN_BAD_SPECULATION.ALL",
- "PublicDescription": "Fixed Counter: Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. Counts all issue slots blocked during this recovery window including relevant microcode flows and while uops are not yet available in the IQ. Also, includes the issue slots that were consumed by the backend but were thrown away because they were younger than the mispredict or machine clear.",
"SampleAfterValue": "1000003",
- "UMask": "0x5",
"Unit": "cpu_atom"
},
{
- "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls",
+ "BriefDescription": "Counts the number of issue slots that were not consumed by the backend because allocation is stalled due to a mispredicted jump or a machine clear. [This event is alias to TOPDOWN_BAD_SPECULATIONALL]",
+ "EventCode": "0x73",
+ "EventName": "TOPDOWN_BAD_SPECULATION.ALL_P",
+ "SampleAfterValue": "1000003",
+ "Unit": "cpu_atom"
+ },
+ {
+ "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls [This event is alias to TOPDOWN_BE_BOUND.ALL_P]",
"EventCode": "0xa4",
"EventName": "TOPDOWN_BE_BOUND.ALL",
"SampleAfterValue": "1000003",
"UMask": "0x2",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Counts the number of retirement slots not consumed due to backend stalls [This event is alias to TOPDOWN_BE_BOUND.ALL]",
+ "EventCode": "0xa4",
+ "EventName": "TOPDOWN_BE_BOUND.ALL_P",
+ "SampleAfterValue": "1000003",
+ "UMask": "0x2",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "Fixed Counter: Counts the number of retirement slots not consumed due to front end stalls",
"EventName": "TOPDOWN_FE_BOUND.ALL",
@@ -198,18 +262,35 @@
"Unit": "cpu_atom"
},
{
- "BriefDescription": "Fixed Counter: Counts the number of consumed retirement slots. Similar to UOPS_RETIRED.ALL",
+ "BriefDescription": "Counts the number of retirement slots not consumed due to front end stalls",
+ "EventCode": "0x9c",
+ "EventName": "TOPDOWN_FE_BOUND.ALL_P",
+ "SampleAfterValue": "1000003",
+ "UMask": "0x1",
+ "Unit": "cpu_atom"
+ },
+ {
+ "BriefDescription": "Fixed Counter: Counts the number of consumed retirement slots.",
"EventName": "TOPDOWN_RETIRING.ALL",
"PEBS": "1",
"SampleAfterValue": "1000003",
"UMask": "0x7",
"Unit": "cpu_atom"
},
+ {
+ "BriefDescription": "Counts the number of consumed retirement slots.",
+ "EventCode": "0xc2",
+ "EventName": "TOPDOWN_RETIRING.ALL_P",
+ "PEBS": "1",
+ "SampleAfterValue": "1000003",
+ "UMask": "0x2",
+ "Unit": "cpu_atom"
+ },
{
"BriefDescription": "This event counts a subset of the Topdown Slots event that are utilized by operations that eventually get retired (committed) by the processor pipeline. Usually, this event positively correlates with higher performance for example, as measured by the instructions-per-cycle metric.",
"EventCode": "0xc2",
"EventName": "UOPS_RETIRED.SLOTS",
- "PublicDescription": "This event counts a subset of the Topdown Slots event that are utilized by operations that eventually get retired (committed) by the processor pipeline. Usually, this event positively correlates with higher performance for example, as measured by the instructions-per-cycle metric.\nSoftware can use this event as the numerator for the Retiring metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
+ "PublicDescription": "This event counts a subset of the Topdown Slots event that are utilized by operations that eventually get retired (committed) by the processor pipeline. Usually, this event positively correlates with higher performance for example, as measured by the instructions-per-cycle metric. Software can use this event as the numerator for the Retiring metric (or top-level category) of the Top-down Microarchitecture Analysis method.",
"SampleAfterValue": "2000003",
"UMask": "0x2",
"Unit": "cpu_core"
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index a219dc3bd1ef..710f8dfefeed 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -20,7 +20,7 @@ GenuineIntel-6-3A,v24,ivybridge,core
GenuineIntel-6-3E,v24,ivytown,core
GenuineIntel-6-2D,v24,jaketown,core
GenuineIntel-6-(57|85),v16,knightslanding,core
-GenuineIntel-6-BD,v1.00,lunarlake,core
+GenuineIntel-6-BD,v1.01,lunarlake,core
GenuineIntel-6-A[AC],v1.07,meteorlake,core
GenuineIntel-6-1[AEF],v4,nehalemep,core
GenuineIntel-6-2E,v4,nehalemex,core
--
2.44.0.396.g6e790dbe36-goog
Powered by blists - more mailing lists