Message-ID: <20240624082138.GA31592@noisy.programming.kicks-ass.net>
Date: Mon, 24 Jun 2024 10:21:38 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: "Liang, Kan" <kan.liang@...ux.intel.com>
Cc: mingo@...nel.org, acme@...nel.org, namhyung@...nel.org,
irogers@...gle.com, adrian.hunter@...el.com,
alexander.shishkin@...ux.intel.com, linux-kernel@...r.kernel.org,
ak@...ux.intel.com, eranian@...gle.com
Subject: Re: [RESEND PATCH 01/12] perf/x86/intel: Support the PEBS event mask
On Thu, Jun 20, 2024 at 11:58:42AM -0400, Liang, Kan wrote:
>
>
> On 2024-06-20 3:02 a.m., Peter Zijlstra wrote:
> > On Tue, Jun 18, 2024 at 08:10:33AM -0700, kan.liang@...ux.intel.com wrote:
> >> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
> >> index e010bfed8417..a0104c82baed 100644
> >> --- a/arch/x86/events/intel/ds.c
> >> +++ b/arch/x86/events/intel/ds.c
> >
> >> @@ -2157,6 +2157,7 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
> >> void *base, *at, *top;
> >> short counts[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
> >> short error[INTEL_PMC_IDX_FIXED + MAX_FIXED_PEBS_EVENTS] = {};
> >> + int max_pebs_events = hweight64(x86_pmu.pebs_events_mask);
> >
> > Consider pebs_events_mask = 0xf0, then max_pebs_events becomes 4.
>
> intel_pmu_drain_pebs_nhm() is an NHM-specific function. It's
> impossible for it to see a pebs_events_mask = 0xf0.
>
> There are only 4 counters. The mask should always be 0xf.
> >
> >> int bit, i, size;
> >> u64 mask;
> >>
> >> @@ -2168,8 +2169,8 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
> >>
> >> ds->pebs_index = ds->pebs_buffer_base;
> >>
> >> - mask = (1ULL << x86_pmu.max_pebs_events) - 1;
> >> - size = x86_pmu.max_pebs_events;
> >> + mask = x86_pmu.pebs_events_mask;
> >> + size = max_pebs_events;
> >
> > This is wrong.. you have 8 bits to iterate, of which only the top 4 are
> > set.
> >
> >> if (x86_pmu.flags & PMU_FL_PEBS_ALL) {
> >> mask |= ((1ULL << x86_pmu.num_counters_fixed) - 1) << INTEL_PMC_IDX_FIXED;
> >> size = INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed;
> >> @@ -2208,8 +2209,8 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, struct perf_sample_d
> >> pebs_status = p->status = cpuc->pebs_enabled;
> >>
> >> bit = find_first_bit((unsigned long *)&pebs_status,
> >> - x86_pmu.max_pebs_events);
> >> - if (bit >= x86_pmu.max_pebs_events)
> >> + max_pebs_events);
> >> + if (bit >= max_pebs_events)
> >> continue;
> >
> > But the bit check here then truncates the thing to the lower 4 bits
> > which are all 0.
> >
> > Should this not be something like:
> >
> > if (!(pebs_event_mask & (1 << bit)))
> > continue;
> >
> > ?
> >
>
> For the existing hardware, I don't think it's necessary. The counters
> are contiguous.
Sure, but it still makes no sense. Fundamentally the operation is wrong;
it works by accident in this one special case. The moment someone reuses
the code (or copy/pastes it) and it goes outside the special case, it
goes to hell.
Also, anybody that actually understands bitops will have a giant WTF
when they read this -- which is what happened here.