Date:   Fri, 29 Jun 2018 15:29:12 +0100
From:   Suzuki K Poulose <Suzuki.Poulose@....com>
To:     Mark Rutland <mark.rutland@....com>
Cc:     linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
        will.deacon@....com, robin.murphy@....com, julien.thierry@....com
Subject: Re: [PATCH v3 7/7] arm64: perf: Add support for chaining event
 counters

On 29/06/18 15:01, Mark Rutland wrote:
> On Tue, Jun 19, 2018 at 11:15:42AM +0100, Suzuki K Poulose wrote:
>> Add support for 64bit events by using chained event counters
>> and 64bit cycle counters.
>>
>> PMUv3 allows chaining a pair of adjacent 32-bit counters, effectively
>> forming a 64-bit counter. The low/even counter is programmed to count
>> the event of interest, and the high/odd counter is programmed to count
>> the CHAIN event, taken when the low/even counter overflows.
>>
>> For CPU cycles, when 64bit mode is requested, the cycle counter
>> is used in 64bit mode. If the cycle counter is not available, we
>> fall back to chaining.
>>
>> Cc: Mark Rutland <mark.rutland@....com>
>> Cc: Will Deacon <will.deacon@....com>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@....com>
>> ---
>> Changes since v2:
>>   - Drop special allocation algorithm for chain indices
>>   - Since we access the counters when the PMU is stopped,
>>     get rid of the unnecessary barriers.
>>   - Ensure a counter is allocated when checking for chained event
>> ---
>>   arch/arm64/kernel/perf_event.c | 184 ++++++++++++++++++++++++++++++++++++-----
>>   drivers/perf/arm_pmu.c         |  13 ++-
>>   2 files changed, 169 insertions(+), 28 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
>> index eebc635..a03def7 100644
>> --- a/arch/arm64/kernel/perf_event.c
>> +++ b/arch/arm64/kernel/perf_event.c
>> @@ -446,9 +446,16 @@ static struct attribute_group armv8_pmuv3_events_attr_group = {
>>   };
>>   
>>   PMU_FORMAT_ATTR(event, "config:0-15");
>> +PMU_FORMAT_ATTR(bits64, "config1:0");
> 
> I'm not too keen on the "bits64" name here -- it's a little unusual.
> Perhaps "long"?

I wasn't either. The other option was "_64bit", but that is easier to
misspell. The only reason for not sticking with "long" is that it gives
the perception that the user always gets double the normal counter width,
which may break if we ever get 64bit PMU counters by default, without
chaining.
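
For context, whichever name we pick, the ABI is the same: userspace
asks for the wide counter by setting bit 0 of config1. A minimal
userspace sketch (the raw event number 0x11, CPU_CYCLES, and the use
of PERF_TYPE_RAW are just for illustration; a specific PMU's dynamic
type from sysfs could be used instead):

	#include <linux/perf_event.h>
	#include <string.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	static int open_wide_counter(void)
	{
		struct perf_event_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.size = sizeof(attr);
		attr.type = PERF_TYPE_RAW;	/* core PMU raw event */
		attr.config = 0x11;		/* CPU_CYCLES, for illustration */
		attr.config1 = 0x1;		/* bit 0: request a 64bit counter */

		/* count for the calling thread, on any CPU */
		return syscall(__NR_perf_event_open, &attr, 0, -1, -1, 0);
	}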

> 
>> +
>> +static inline bool armv8pmu_event_is_64bit(struct perf_event *event)
>> +{
>> +	return event->attr.config1 & 0x1;
>> +}
>>   
>>   static struct attribute *armv8_pmuv3_format_attrs[] = {
>>   	&format_attr_event.attr,
>> +	&format_attr_bits64.attr,
>>   	NULL,
>>   };
>>   
>> @@ -466,6 +473,20 @@ static struct attribute_group armv8_pmuv3_format_attr_group = {
>>   	(ARMV8_IDX_CYCLE_COUNTER + cpu_pmu->num_events - 1)
>>   
>>   /*
>> + * Use chained counter for a 64bit event, if we could not allocate
>> + * the 64bit cycle counter. This must be called after a counter
>> + * was allocated.
>> + */
>> +static inline bool armv8pmu_event_is_chained(struct perf_event *event)
>> +{
>> +	int idx = event->hw.idx;
>> +
>> +	return !WARN_ON(idx < 0) &&
>> +	       armv8pmu_event_is_64bit(event) &&
>> +	       (event->hw.idx != ARMV8_IDX_CYCLE_COUNTER);
>> +}
> 
> It took me a moment to parse this. Perhaps:

Yes, it does look a bit weird.

> 
> 	/*
> 	 * The dedicated cycle counter doesn't require chaining, but
> 	 * when this is already in use, we must chain two programmable
> 	 * counters to form a 64-bit cycle counter.
> 	 */

That sounds like we check only for cycle events, which is not true.
How about:

	/*
	 * We must chain two programmable counters for 64bit events,
	 * except when we have allocated the 64bit cycle counter (for
	 * the CPU cycles event).
	 */
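
To make the pairing concrete, with event->hw.idx pointing at the
high/odd counter, the event type programming would look roughly like
the sketch below (not the final code; ARMV8_PMUV3_PERFCTR_CHAIN is the
architected CHAIN event, 0x1e):

	static inline void armv8pmu_write_event_type(struct perf_event *event)
	{
		struct hw_perf_event *hwc = &event->hw;
		int idx = hwc->idx;

		if (armv8pmu_event_is_chained(event)) {
			/* low/even counter counts the event itself */
			armv8pmu_write_evtype(idx - 1, hwc->config_base);
			/* high/odd counter counts overflows of the low one */
			armv8pmu_write_evtype(idx, ARMV8_PMUV3_PERFCTR_CHAIN);
		} else {
			armv8pmu_write_evtype(idx, hwc->config_base);
		}
	}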


> 
>> +
>> +/*
>>    * ARMv8 low level PMU access
>>    */
>>   
>> @@ -516,12 +537,28 @@ static inline u32 armv8pmu_read_evcntr(int idx)
>>   	return read_sysreg(pmxevcntr_el0);
>>   }
>>   
>> +static inline u64 armv8pmu_read_hw_counter(struct perf_event *event)
>> +{
>> +	int idx = event->hw.idx;
>> +	u64 val = 0;
>> +
>> +	val = armv8pmu_read_evcntr(idx);
>> +	/*
>> +	 * We always read the counter with the PMU turned off.
>> +	 * So we don't need special care for reading chained
>> +	 * counters.
>> +	 */
> 
> I think this comment can go.

OK

>> +	if (armv8pmu_event_is_chained(event))
>> +		val = (val << 32) | armv8pmu_read_evcntr(idx - 1);
>> +	return val;
>> +}
>> +
>>   static inline u64 armv8pmu_read_counter(struct perf_event *event)
>>   {
>>   	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
>>   	struct hw_perf_event *hwc = &event->hw;
>>   	int idx = hwc->idx;
>> -	u32 value = 0;
>> +	u64 value = 0;
>>   
>>   	if (!armv8pmu_counter_valid(cpu_pmu, idx))
>>   		pr_err("CPU%u reading wrong counter %d\n",
>> @@ -529,7 +566,7 @@ static inline u64 armv8pmu_read_counter(struct perf_event *event)
>>   	else if (idx == ARMV8_IDX_CYCLE_COUNTER)
>>   		value = read_sysreg(pmccntr_el0);
>>   	else
>> -		value = armv8pmu_read_evcntr(idx);
>> +		value = armv8pmu_read_hw_counter(event);
>>   
>>   	return value;
>>   }
>> @@ -540,6 +577,19 @@ static inline void armv8pmu_write_evcntr(int idx, u32 value)
>>   	write_sysreg(value, pmxevcntr_el0);
>>   }
>>   
>> +static inline void armv8pmu_write_hw_counter(struct perf_event *event,
>> +					     u64 value)
>> +{
>> +	int idx = event->hw.idx;
>> +
>> +	if (armv8pmu_event_is_chained(event)) {
>> +		armv8pmu_write_evcntr(idx, upper_32_bits(value));
>> +		armv8pmu_write_evcntr(idx - 1, lower_32_bits(value));
>> +	} else {
>> +		armv8pmu_write_evcntr(idx, value);
>> +	}
>> +}
>> +
>>   static inline void armv8pmu_write_counter(struct perf_event *event, u64 value)
>>   {
>>   	struct arm_pmu *cpu_pmu = to_arm_pmu(event->pmu);
>> @@ -551,14 +601,15 @@ static inline void armv8pmu_write_counter(struct perf_event *event, u64 value)
>>   			smp_processor_id(), idx);
>>   	else if (idx == ARMV8_IDX_CYCLE_COUNTER) {
>>   		/*
>> -		 * Set the upper 32bits as this is a 64bit counter but we only
>> +		 * Set the upper 32bits as this is a 64bit counter, if we only
>>   		 * count using the lower 32bits and we want an interrupt when
>>   		 * it overflows.
>>   		 */
> 
> Let's reword this as:
> 
> 		/*
> 		 * The cycles counter is really a 64-bit counter.
> 		 * When treating it as a 32-bit counter, we only count
> 		 * the lower 32 bits, and set the upper 32-bits so that
> 		 * we get an interrupt upon 32-bit overflow.
> 		 */
> 

OK
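
For reference, the code under that comment then reduces to something
like:

	if (!armv8pmu_event_is_64bit(event))
		value |= 0xffffffff00000000ULL;
	write_sysreg(value, pmccntr_el0);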

...

>> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
>> index 6e10e8c..a4675e4 100644
>> --- a/drivers/perf/arm_pmu.c
>> +++ b/drivers/perf/arm_pmu.c
>> @@ -674,14 +674,13 @@ static void cpu_pm_pmu_setup(struct arm_pmu *armpmu, unsigned long cmd)
>>   	int idx;
>>   
>>   	for (idx = 0; idx < armpmu->num_events; idx++) {
>> -		/*
>> -		 * If the counter is not used skip it, there is no
>> -		 * need of stopping/restarting it.
>> -		 */
>> -		if (!test_bit(idx, hw_events->used_mask))
>> -			continue;
>> -
>>   		event = hw_events->events[idx];
>> +		/*
>> +		 * If there is no event at this idx (e.g., an idx used
>> +		 * by a chained event in Arm v8 PMUv3), skip it.
>> +		 */
>> +		if (!event)
>> +			continue;
>>   
>>   		switch (cmd) {
>>   		case CPU_PM_ENTER:
> 
> Please make this change an individual patch earlier in the series. It's
> a bug fix needed for krait, too.

I can do that. But I don't think krait needs this, as explained in the
other patch.

Thanks for the review.

Suzuki
