Message-ID: <698e3cf3-3579-4bd4-93e2-bbcda56a2a29@linux.intel.com>
Date: Tue, 14 Jan 2025 12:52:36 -0500
From: "Liang, Kan" <kan.liang@...ux.intel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: mingo@...hat.com, acme@...nel.org, namhyung@...nel.org,
irogers@...gle.com, adrian.hunter@...el.com, linux-kernel@...r.kernel.org,
linux-perf-users@...r.kernel.org, ak@...ux.intel.com, eranian@...gle.com,
dapeng1.mi@...ux.intel.com
Subject: Re: [PATCH V8 2/2] perf/x86/intel: Support PEBS counters snapshotting
On 2025-01-14 7:00 a.m., Peter Zijlstra wrote:
> On Mon, Jan 06, 2025 at 06:21:03AM -0800, kan.liang@...ux.intel.com wrote:
>> @@ -4059,6 +4087,12 @@ static int intel_pmu_hw_config(struct perf_event *event)
>> event->hw.flags |= PERF_X86_EVENT_PEBS_VIA_PT;
>> }
>>
>> + if ((event->attr.sample_type & PERF_SAMPLE_READ) &&
>> + (x86_pmu.intel_cap.pebs_format >= 6) &&
>> + is_sampling_event(event) &&
>> + event->attr.precise_ip)
>> + event->group_leader->hw.flags |= PERF_X86_EVENT_PEBS_CNTR;
>> +
>
> White space fail, easily fixed though.
>
>> if ((event->attr.type == PERF_TYPE_HARDWARE) ||
>> (event->attr.type == PERF_TYPE_HW_CACHE))
>> return 0;
>> @@ -4167,6 +4201,24 @@ static int intel_pmu_hw_config(struct perf_event *event)
>> return 0;
>> }
>>
>> +static int intel_pmu_schedule_events(struct cpu_hw_events *cpuc, int n, int *assign)
>> +{
>> + struct perf_event *event;
>> + int ret = x86_schedule_events(cpuc, n, assign);
>> +
>> + if (ret)
>> + return ret;
>> +
>> + if (cpuc->is_fake)
>> + return ret;
>> +
>> + event = cpuc->event_list[n - 1];
>> + if (event && is_pebs_counter_event_group(event))
>> + intel_pmu_pebs_update_cfg(cpuc, n, assign);
>> +
>> + return 0;
>> +}
>
> This lit up the WTF'o'meter for a bit. This needs a comment at the very
> least, but I also hate how this relies on the core code never doing a
> transaction larger than a single group.
>
> Furthermore, you can have multiple ->schedule_events() calls in a single
> pmu_disable() section, so why is schedule_events() the right place to do
> this?
>
> Could it not happen that you add group-a, which has this PEBS_CNTR thing
> on, computes the fancy new pebs_data_cfg field. Then adds another event,
> which perturbs the counter placement, does not update the pebs_data_cfg
> and you're up a creek?
>
> I would've thought that x86_pmu_enable() would be a better place for
> this -- that's the one place where everything is set up, right before it
> is made to go.
Yes, x86_pmu_enable() seems a better place for the global setup.
> Only problem seems to be x86_pmu_enable_all() /
> x86_pmu.enable_all() isn't given the right information, but that should
> be fixable. Maybe clear cpuc->n_added after calling enable_all() ?
Not x86_pmu_enable_all(). The setup should be done before
x86_pmu_start(), which updates MSR_PEBS_DATA_CFG.
I will send out a V9 to address it.
Thanks,
Kan
>
>
>> diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
>> index ba74e1198328..e36bfb95c2a3 100644
>> --- a/arch/x86/events/intel/ds.c
>> +++ b/arch/x86/events/intel/ds.c
>> @@ -1308,10 +1308,63 @@ static void adaptive_pebs_record_size_update(void)
>> sz += sizeof(struct pebs_xmm);
>> if (pebs_data_cfg & PEBS_DATACFG_LBRS)
>> sz += x86_pmu.lbr_nr * sizeof(struct lbr_entry);
>> + if (pebs_data_cfg & (PEBS_DATACFG_METRICS | PEBS_DATACFG_CNTR)) {
>> + sz += sizeof(struct pebs_cntr_header);
>> +
>> + /* Metrics base and Metrics Data */
>> + if (pebs_data_cfg & PEBS_DATACFG_METRICS)
>> + sz += 2 * sizeof(u64);
>> +
>> + if (pebs_data_cfg & PEBS_DATACFG_CNTR) {
>> + sz += hweight64((pebs_data_cfg >> PEBS_DATACFG_CNTR_SHIFT) & PEBS_DATACFG_CNTR_MASK)
>> + * sizeof(u64);
>> + sz += hweight64((pebs_data_cfg >> PEBS_DATACFG_FIX_SHIFT) & PEBS_DATACFG_FIX_MASK)
>> + * sizeof(u64);
>
> blergh, when splitting lines the operator goes on the end of the last
> line. These lines are too long anyway.
>
> Maybe:
>
> #define PEBS_DATACFG_CNTR(x) \
> 	(((x) >> PEBS_DATACFG_CNTR_SHIFT) & PEBS_DATACFG_CNTR_MASK)
> #define PEBS_DATACFG_FIX(x) \
> 	(((x) >> PEBS_DATACFG_FIX_SHIFT) & PEBS_DATACFG_FIX_MASK)
>
> sz += (hweight64(PEBS_DATACFG_CNTR(pebs_data_cfg)) +
> 	       hweight64(PEBS_DATACFG_FIX(pebs_data_cfg))) *
> sizeof(u64);
>
>> + }
>> + }
>>
>> cpuc->pebs_record_size = sz;
>> }
>>
>> +static void __intel_pmu_pebs_update_cfg(struct perf_event *event,
>> + int idx, u64 *pebs_data_cfg)
>> +{
>> + if (is_metric_event(event)) {
>> + *pebs_data_cfg |= PEBS_DATACFG_METRICS;
>> + return;
>> + }
>> +
>> + *pebs_data_cfg |= PEBS_DATACFG_CNTR;
>> +
>> + if (idx >= INTEL_PMC_IDX_FIXED) {
>> + *pebs_data_cfg |= ((1ULL << (idx - INTEL_PMC_IDX_FIXED)) & PEBS_DATACFG_FIX_MASK)
>> + << PEBS_DATACFG_FIX_SHIFT;
>> + } else {
>> + *pebs_data_cfg |= ((1ULL << idx) & PEBS_DATACFG_CNTR_MASK)
>> + << PEBS_DATACFG_CNTR_SHIFT;
>
> Also yuck. Maybe:
>
> #define PEBS_DATACFG_FIX_BIT(x) \
> 	(((1ULL << (x)) & PEBS_DATACFG_FIX_MASK) << PEBS_DATACFG_FIX_SHIFT)
>
>
> 	*pebs_data_cfg |= PEBS_DATACFG_FIX_BIT(idx - INTEL_PMC_IDX_FIXED);
>
>
>
>> + }
>> +}
>> +
>
>
>