lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 21 Jul 2020 10:23:36 -0400
From:   "Liang, Kan" <kan.liang@...ux.intel.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     acme@...hat.com, mingo@...nel.org, linux-kernel@...r.kernel.org,
        jolsa@...nel.org, eranian@...gle.com,
        alexander.shishkin@...ux.intel.com, ak@...ux.intel.com
Subject: Re: [PATCH V6 09/14] perf/x86/intel: Support TopDown metrics on Ice
 Lake



On 7/21/2020 8:40 AM, Peter Zijlstra wrote:
> On Fri, Jul 17, 2020 at 07:05:49AM -0700, kan.liang@...ux.intel.com wrote:
> 
>> +static inline u64 icl_get_metrics_event_value(u64 metric, u64 slots, int idx)
>> +{
>> +	u32 val;
>> +
>> +	/*
>> +	 * The metric is reported as an 8bit integer fraction
>> +	 * suming up to 0xff.
>> +	 * slots-in-metric = (Metric / 0xff) * slots
>> +	 */
>> +	val = (metric >> ((idx - INTEL_PMC_IDX_METRIC_BASE) * 8)) & 0xff;
>> +	return  mul_u64_u32_div(slots, val, 0xff);
>> +}
>> +
>> +static void __icl_update_topdown_event(struct perf_event *event,
>> +				       u64 slots, u64 metrics)
>> +{
>> +	int idx = event->hw.idx;
>> +	u64 delta;
>> +
>> +	if (is_metric_idx(idx))
>> +		delta = icl_get_metrics_event_value(metrics, slots, idx);
>> +	else
>> +		delta = slots;
>> +
>> +	local64_add(delta, &event->count);
>> +}
>> +
>> +/*
>> + * Update all active Topdown events.
>> + *
>> + * The PERF_METRICS and Fixed counter 3 are read separately. The values may be
>> + * modify by a NMI. PMU has to be disabled before calling this function.
>> + */
>> +static u64 icl_update_topdown_event(struct perf_event *event)
>> +{
>> +	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
>> +	struct perf_event *other;
>> +	u64 slots, metrics;
>> +	int idx;
>> +
>> +	/* read Fixed counter 3 */
>> +	rdpmcl((3 | INTEL_PMC_FIXED_RDPMC_BASE), slots);
>> +	if (!slots)
>> +		return 0;
>> +
>> +	/* read PERF_METRICS */
>> +	rdpmcl(INTEL_PMC_FIXED_RDPMC_METRICS, metrics);
>> +
>> +	for_each_set_bit(idx, cpuc->active_mask, INTEL_PMC_IDX_TD_BE_BOUND + 1) {
>> +		if (!is_topdown_idx(idx))
>> +			continue;
>> +		other = cpuc->events[idx];
>> +		__icl_update_topdown_event(other, slots, metrics);
>> +	}
>> +
>> +	/*
>> +	 * Check and update this event, which may have been cleared
>> +	 * in active_mask e.g. x86_pmu_stop()
>> +	 */
>> +	if (event && !test_bit(event->hw.idx, cpuc->active_mask))
>> +		__icl_update_topdown_event(event, slots, metrics);
>> +
>> +	/*
>> +	 * Software is recommended to periodically clear both registers
>> +	 * in order to maintain accurate measurements, which is required for
>> +	 * certain scenarios that involve sampling metrics at high rates.
> 
> I'll maintain that that statement is clear as mud and therefore useless.
> 
>> +	 * Software should always write fixed counter 3 before write to
>> +	 * PERF_METRICS.
>> +	 */
>> +	wrmsrl(MSR_CORE_PERF_FIXED_CTR3, 0);
>> +	wrmsrl(MSR_PERF_METRICS, 0);
>> +
>> +	return slots;
>> +}
> 
> So in the normal case, this ends up calling into this function _5_
> times, once for each event. Now the first time, it will actually do
> something useful, but the other 4 times it's just wasting cycles.
> 
> Also, that for_each_set_bit() loop, trying to find the events to
> update...
> 
> Can't we, instead, make the SLOTS update advance 5 running counters in
> cpuc and feed the events off of that?
> 

Patch 13 forces the slots event to be part of a metric group. In patch 
7, for a metric group, we only update the values once with slots event.
I think the normal case mentioned above should not happen.

+	/* Only need to call update_topdown_event() once for group read. */
+	if ((cpuc->txn_flags & PERF_PMU_TXN_READ) &&
+	    !is_slots_event(event))
+		return;
+
+	perf_pmu_disable(event->pmu);
+	x86_pmu.update_topdown_event(event);
+	perf_pmu_enable(event->pmu);

Thanks,
Kan


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ