[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <59321f8a-c804-4c0d-8941-3805c078d2c6@linux.intel.com>
Date: Thu, 27 Feb 2025 10:04:09 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: "Liang, Kan" <kan.liang@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo
<acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Andi Kleen <ak@...ux.intel.com>, Eranian Stephane <eranian@...gle.com>,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org,
Dapeng Mi <dapeng1.mi@...el.com>
Subject: Re: [Patch v2 10/24] perf/x86/intel: Process arch-PEBS records or
record fragments
On 2/26/2025 11:45 PM, Liang, Kan wrote:
>
> On 2025-02-26 4:35 a.m., Peter Zijlstra wrote:
>> On Wed, Feb 26, 2025 at 01:20:37PM +0800, Mi, Dapeng wrote:
>>
>>>> Also, should that workaround have been extended to also include
>>>> GLOBAL_STATUS_PERF_METRICS_OVF in that mask, or was that defect fixed
>>>> for every chip capable of metrics stuff?
>>> hmm, per my understanding, GLOBAL_STATUS_PERF_METRICS_OVF handling should
>>> only be skipped when fixed counter 3 or perf metrics are included in PEBS
>>> counter group. In this case, the slots and topdown metrics have been
>>> updated by PEBS handler. It should not be processed again.
>>>
>>> @Kan Liang, is it correct?
>> Right, so the thing is, *any* PEBS event pending will clear METRICS_OVF
>> per:
>>
>> status &= x86_pmu.intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
>>
> Yes, we have to add it for both legacy PEBS and ARCH PEBS.
>
> An alternative way may change the order of handling the overflow bit.
>
> The commit daa864b8f8e3 ("perf/x86/pebs: Fix handling of PEBS buffer
> overflows") has moved the "status &= ~cpuc->pebs_enabled;" out of PEBS
> overflow code.
>
> As long as the PEBS overflow is handled after PT, I don't think the
> above is required anymore.
>
> It should be similar to METRICS_OVF. But the PEBS counters snapshotting
> should be specially handled, since the PEBS will handle the metrics
> counter as well.
>
> @@ -3211,7 +3211,8 @@ static int handle_pmi_common(struct pt_regs *regs,
> u64 status)
> /*
> * Intel Perf metrics
> */
> - if (__test_and_clear_bit(GLOBAL_STATUS_PERF_METRICS_OVF_BIT, (unsigned
> long *)&status)) {
> + if (__test_and_clear_bit(GLOBAL_STATUS_PERF_METRICS_OVF_BIT, (unsigned
> long *)&status) &&
> +
> !is_pebs_counter_event_group(cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS])) {
> handled++;
> static_call(intel_pmu_update_topdown_event)(NULL, NULL);
> }
Yes, we still need to handle METRICS_OVF if fixed counter 3 and metrics are
not included into counter group. It ensure the metrics count can be updated
timely once PERF_METRICS MSR overflows.
Since there were more and more bits added into GLOBAL_CTRL_STAT MSR in
past several years, it becomes not correct to execute the below code in
BUFFER_OVF_BIT handling code.
status &= intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
It unconditionally clears METRICS_OVF and other bits which could be added
in the future. This is incorrect and could introduce potential issues.
Combining Kan's change, I think we can change the code like this.
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index 8ef5b9a05fcc..0cf0f95b1af4 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3192,7 +3192,6 @@ static int handle_pmi_common(struct pt_regs *regs,
u64 status)
struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
int bit;
int handled = 0;
- u64 intel_ctrl = hybrid(cpuc->pmu, intel_ctrl);
inc_irq_stat(apic_perf_irqs);
@@ -3236,7 +3235,6 @@ static int handle_pmi_common(struct pt_regs *regs,
u64 status)
handled++;
x86_pmu_handle_guest_pebs(regs, &data);
static_call(x86_pmu_drain_pebs)(regs, &data);
- status &= intel_ctrl | GLOBAL_STATUS_TRACE_TOPAPMI;
/*
* PMI throttle may be triggered, which stops the PEBS event.
@@ -3269,12 +3267,17 @@ static int handle_pmi_common(struct pt_regs *regs,
u64 status)
/*
* Intel Perf metrics
+ * If PEBS counter group includes fix counter 3, PEBS handler would
update
+ * topdown events which is more accurate, it's unnecessary to
update again.
*/
- if (__test_and_clear_bit(GLOBAL_STATUS_PERF_METRICS_OVF_BIT,
(unsigned long *)&status)) {
+ if (__test_and_clear_bit(GLOBAL_STATUS_PERF_METRICS_OVF_BIT,
(unsigned long *)&status) &&
+
!is_pebs_counter_event_group(cpuc->events[INTEL_PMC_IDX_FIXED_SLOTS])) {
handled++;
static_call(intel_pmu_update_topdown_event)(NULL, NULL);
}
+ status &= hybrid(cpuc->pmu, intel_ctrl);
+
/*
* Checkpointed counters can lead to 'spurious' PMIs because the
* rollback caused by the PMI will have cleared the overflow status
>
> Thanks,
> Kan
>
>
Powered by blists - more mailing lists