Message-ID: <c1450cf4-f367-4675-9f5e-90416a996af1@linux.intel.com>
Date: Wed, 26 Feb 2025 13:20:37 +0800
From: "Mi, Dapeng" <dapeng1.mi@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo
<acme@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Kan Liang <kan.liang@...ux.intel.com>, Andi Kleen <ak@...ux.intel.com>,
Eranian Stephane <eranian@...gle.com>, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org, Dapeng Mi <dapeng1.mi@...el.com>
Subject: Re: [Patch v2 10/24] perf/x86/intel: Process arch-PEBS records or
record fragments
On 2/25/2025 7:00 PM, Peter Zijlstra wrote:
> On Tue, Feb 25, 2025 at 11:39:27AM +0100, Peter Zijlstra wrote:
>> On Tue, Feb 18, 2025 at 03:28:04PM +0000, Dapeng Mi wrote:
>>> A significant difference from adaptive PEBS is that arch-PEBS
>>> supports record fragments: a single arch-PEBS record can be split
>>> into several independent fragments, each carrying its own
>>> arch-PEBS header.
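To illustrate the fragment model: consuming one record amounts to
walking the buffer fragment by fragment, each fragment starting with
its own header. A minimal sketch with placeholder type, field and
helper names, not the layout this patch actually defines:

	struct frag_hdr {
		u64 size;	/* placeholder: total bytes of this fragment */
	};

	static void walk_arch_pebs_record(void *base, void *top)
	{
		void *at = base;

		while (at < top) {
			struct frag_hdr *hdr = at;

			process_fragment(hdr);	/* placeholder helper */
			at += hdr->size;	/* next fragment's header follows */
		}
	}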
>>>
>>> This patch defines the architectural PEBS record layout structures
>>> and adds helpers to process arch-PEBS records or fragments. Only
>>> the legacy PEBS groups (basic, GPR, XMM and LBR) are supported in
>>> this patch; capturing the newly added YMM/ZMM/OPMASK vector
>>> registers will be supported by subsequent patches.
>>>
>>> Signed-off-by: Dapeng Mi <dapeng1.mi@...ux.intel.com>
>>> ---
>>>  arch/x86/events/intel/core.c      |   9 ++
>>>  arch/x86/events/intel/ds.c        | 219 ++++++++++++++++++++++++++++++
>>>  arch/x86/include/asm/msr-index.h  |   6 +
>>>  arch/x86/include/asm/perf_event.h | 100 ++++++++++++++
>>>  4 files changed, 334 insertions(+)
>>>
>>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>>> index 37540eb80029..184f69afde08 100644
>>> --- a/arch/x86/events/intel/core.c
>>> +++ b/arch/x86/events/intel/core.c
>>> @@ -3124,6 +3124,15 @@ static int handle_pmi_common(struct pt_regs *regs, u64 status)
>>>  			wrmsrl(MSR_IA32_PEBS_ENABLE, cpuc->pebs_enabled);
>>>  	}
>>>
>>> +	/*
>>> +	 * Arch PEBS sets bit 54 in the global status register
>>> +	 */
>>> +	if (__test_and_clear_bit(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT,
>>> +				 (unsigned long *)&status)) {
>> Will arch_pebs hardware ever toggle bit 62?
> This had me looking at the bit 62 handling, and I noticed the thing from
> commit 8077eca079a2 ("perf/x86/pebs: Add workaround for broken OVFL
> status on HSW+").
>
> Did that ever get fixed in later chips; notably I'm assuming ARCH PEBS
> does not suffer this?
I'm not sure whether arch-PEBS still suffers from this race condition,
but the status cleaning for PEBS counters was moved out of the bit-62
handling scope by commit daa864b8f8e3 ("perf/x86/pebs: Fix handling of
PEBS buffer overflows"). So even if arch-PEBS still has the race, it
is already covered.
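To make the ordering concrete, the flow in handle_pmi_common() after
that commit is roughly the following (a simplified sketch from my
reading of the upstream logic, not the literal code):

	/* bit 62: PEBS buffer overflow -> drain the PEBS buffer */
	if (__test_and_clear_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
				 (unsigned long *)&status)) {
		handled++;
		x86_pmu.drain_pebs(regs, &data);
	}

	/*
	 * PEBS-enabled counters may have their overflow bits set in
	 * 'status' even though drain_pebs() already consumed those
	 * samples. Clearing them unconditionally (no longer only
	 * inside the bit-62 block) keeps the regular sample loop
	 * below from processing them a second time, so the race
	 * stays harmless even if arch-PEBS has it.
	 */
	status &= ~cpuc->pebs_enabled;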
>
> Also, should that workaround have been extended to also include
> GLOBAL_STATUS_PERF_METRICS_OVF in that mask, or was that defect fixed
> for every chip capable of metrics stuff?
Hmm, per my understanding, GLOBAL_STATUS_PERF_METRICS_OVF handling
should only be skipped when fixed counter 3 or the perf metrics are
included in a PEBS counter group. In that case the slots and topdown
metrics have already been updated by the PEBS handler and should not
be processed again.

@Kan Liang, is that correct?
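Expressed as pseudo-code, my understanding amounts to the sketch below
(hypothetical: metrics_in_pebs_counter_group() and the update call are
placeholder names, not existing helpers):

	if (__test_and_clear_bit(GLOBAL_STATUS_PERF_METRICS_OVF_BIT,
				 (unsigned long *)&status)) {
		/*
		 * If fixed counter 3 / the metrics events are members
		 * of a PEBS counter group, the PEBS handler already
		 * updated the slots and topdown metrics; skip them.
		 */
		if (!metrics_in_pebs_counter_group(cpuc)) {
			handled++;
			update_topdown_metrics(cpuc);	/* placeholder */
		}
	}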
>
> In any case, I think we want a patch clarifying the situation with a
> comment.
>
>
>>> +		handled++;
>>> +		x86_pmu.drain_pebs(regs, &data);
>> static_call(x86_pmu_drain_pebs)(regs, &data);
>>
>>> +	}
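For reference, the quoted hunk with that static_call() change applied
would read (sketch of the suggested fixup only):

	/*
	 * Arch PEBS sets bit 54 in the global status register
	 */
	if (__test_and_clear_bit(GLOBAL_STATUS_ARCH_PEBS_THRESHOLD_BIT,
				 (unsigned long *)&status)) {
		handled++;
		static_call(x86_pmu_drain_pebs)(regs, &data);
	}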