Message-ID: <698e3cf3-3579-4bd4-93e2-bbcda56a2a29@linux.intel.com>
Date: Tue, 14 Jan 2025 12:52:36 -0500
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...hat.com, acme@...nel.org, namhyung@...nel.org,
irogers@...gle.com, adrian.hunter@...el.com, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org, ak@...ux.intel.com, eranian@...gle.com,
dapeng1.mi@...ux.intel.com
Subject: Re: [PATCH V8 2/2] perf/x86/intel: Support PEBS counters snapshotting
On 2025-01-14 7:00 a.m., Peter Zijlstra wrote:
> On Mon, Jan 06, 2025 at 06:21:03AM -0800, kan.liang@...ux.intel.com wrote:
>> @@ -4059,6 +4087,12 @@ static int intel_pmu_hw_config(struct perf_event *event)
>> event->hw.flags |= PERF_X86_EVENT_PEBS_VIA_PT;
>> }
>>
>> + if ((event->attr.sample_type & PERF_SAMPLE_READ) &&
>> + (x86_pmu.intel_cap.pebs_format >= 6) &&
>> + is_sampling_event(event) &&
>> + event->attr.precise_ip)
>> + event->group_leader->hw.flags |= PERF_X86_EVENT_PEBS_CNTR;
>> +
>
> White space fail, easily fixed though.
>
>> if ((event->attr.type == PERF_TYPE_HARDWARE) ||
>> (event->attr.type == PERF_TYPE_HW_CACHE))
>> return 0;
>> @@ -4167,6 +4201,24 @@ static int intel_pmu_hw_config(struct perf_event *event)
>> return 0;
>> }
>>
>> +static int intel_pmu_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
>> +{
>> + struct perf_event *event;
>> + int ret = x86_schedule_events(cpuc, n, assign);
>> +
>> + if (ret)
>> + return ret;
>> +
>> + if (cpuc->is_fake)
>> + return ret;
>> +
>> + event = cpuc->event_list[n - 1];
>> + if (event && is_pebs_counter_event_group(event))
>> + intel_pmu_pebs_update_cfg(cpuc, n, assign);
>> +
>> + return 0;
>> +}
>
> This lit up the WTF'o'meter for a bit. This needs a comment at the very
> least, but I also hate how this relies on the core code never doing a
> transaction larger than a single group.
>
> Furthermore, you can have multiple ->schedule_events() calls in a single
> pmu_disable() section, so why is schedule_events() the right place to do
> this?
>
> Could it not happen that you add group-a, which has this PEBS_CNTR thing
> on, computes the fancy new pebs_data_cfg field. Then adds another event,
> which perturbs the counter placement, does not update the pebs_data_cfg
> and you're up a creek?
>
> I would've thought that x86_pmu_enable() would be a better place for
> this -- that's the one place where everything is set up, right before it
> is made to go.
Yes, x86_pmu_enable() seems a better place for the global setup.
> Only problem seems to be x86_pmu_enable_all() /
> x86_pmu.enable_all() isn't given the right information, but that should
> be fixable. Maybe clear cpuc->n_added after calling enable_all() ?
Not x86_pmu_enable_all(). The setup should be done before
x86_pmu_start(), which updates MSR_PEBS_DATA_CFG.
I will send out a V9 to address it.
Thanks,
Kan
>
>
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> index ba74e1198328..e36bfb95c2a3 100644
>> --- a/arch/x86/events/intel/ds.c
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -1308,10 +1308,63 @@ static void adaptive_pebs_record_size_update(void)
>> sz += sizeof(struct pebs_xmm);
>> if (pebs_data_cfg & PEBS_DATACFG_LBRS)
>> sz += x86_pmu.lbr_nr * sizeof(struct lbr_entry);
>> + if (pebs_data_cfg & (PEBS_DATACFG_METRICS | PEBS_DATACFG_CNTR)) {
>> + sz += sizeof(struct pebs_cntr_header);
>> +
>> + /* Metrics base and Metrics Data */
>> + if (pebs_data_cfg & PEBS_DATACFG_METRICS)
>> + sz += 2 * sizeof(u64);
>> +
>> + if (pebs_data_cfg & PEBS_DATACFG_CNTR) {
>> + sz += hweight64((pebs_data_cfg >> PEBS_DATACFG_CNTR_SHIFT) & PEBS_DATACFG_CNTR_MASK)
>> + * sizeof(u64);
>> + sz += hweight64((pebs_data_cfg >> PEBS_DATACFG_FIX_SHIFT) & PEBS_DATACFG_FIX_MASK)
>> + * sizeof(u64);
>
> blergh, when splitting lines the operator goes on the end of the last
> line. These lines are too long anyway.
>
> Maybe:
>
> #define PEBS_DATACFG_CNTR(x) \
> 	(((x) >> PEBS_DATACFG_CNTR_SHIFT) & PEBS_DATACFG_CNTR_MASK)
> #define PEBS_DATACFG_FIX(x) \
> 	(((x) >> PEBS_DATACFG_FIX_SHIFT) & PEBS_DATACFG_FIX_MASK)
>
> sz += (hweight64(PEBS_DATACFG_CNTR(pebs_data_cfg)) +
> 	       hweight64(PEBS_DATACFG_FIX(pebs_data_cfg))) *
> sizeof(u64);
>
>> + }
>> + }
>>
>> cpuc->pebs_record_size = sz;
>> }
>>
>> +static void __intel_pmu_pebs_update_cfg(struct perf_event *event,
>> + int idx, u64 *pebs_data_cfg)
>> +{
>> + if (is_metric_event(event)) {
>> + *pebs_data_cfg |= PEBS_DATACFG_METRICS;
>> + return;
>> + }
>> +
>> + *pebs_data_cfg |= PEBS_DATACFG_CNTR;
>> +
>> + if (idx >= INTEL_PMC_IDX_FIXED) {
>> + *pebs_data_cfg |= ((1ULL << (idx - INTEL_PMC_IDX_FIXED)) & PEBS_DATACFG_FIX_MASK)
>> + << PEBS_DATACFG_FIX_SHIFT;
>> + } else {
>> + *pebs_data_cfg |= ((1ULL << idx) & PEBS_DATACFG_CNTR_MASK)
>> + << PEBS_DATACFG_CNTR_SHIFT;
>
> Also yuck. Maybe:
>
> #define PEBS_DATACFG_FIX_BIT(x) \
> 	(((1ULL << (x)) & PEBS_DATACFG_FIX_MASK) << PEBS_DATACFG_FIX_SHIFT)
>
>
> 	*pebs_data_cfg |= PEBS_DATACFG_FIX_BIT(idx - INTEL_PMC_IDX_FIXED);
>
>
>
>> + }
>> +}
>> +
>
>
>