linux-kernel - Re: [PATCH V5] perf: qcom: Add L3 cache PMU driver

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <f6d5d652b2123a3ad85ef2b0758e4e4a@codeaurora.org>
Date:   Fri, 31 Mar 2017 11:33:47 -0400
From:   Agustin Vega-Frias <agustinv@...eaurora.org>
To:     Mark Rutland <mark.rutland@....com>
Cc:     linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
        Will Deacon <will.deacon@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        timur@...eaurora.org, nleeder@...eaurora.org,
        agross@...eaurora.org, jcm@...hat.com, msalter@...hat.com,
        mlangsdo@...hat.com, ahs3@...hat.com
Subject: Re: [PATCH V5] perf: qcom: Add L3 cache PMU driver

Hey Mark,

On 2017-03-31 09:59, Mark Rutland wrote:
> Hi Agustin,
> 
> On Thu, Mar 23, 2017 at 05:03:17PM -0400, Agustin Vega-Frias wrote:
>> This adds a new dynamic PMU to the Perf Events framework to program
>> and control the L3 cache PMUs in some Qualcomm Technologies SOCs.
>> 
>> The driver supports a distributed cache architecture where the overall
>> cache for a socket is comprised of multiple slices each with its own 
>> PMU.
>> Access to each individual PMU is provided even though all CPUs share 
>> all
>> the slices. User space needs to aggregate to individual counts to 
>> provide
>> a global picture.
>> 
>> The driver exports formatting and event information to sysfs so it can
>> be used by the perf user space tools with the syntaxes:
>>    perf stat -a -e l3cache_0_0/read-miss/
>>    perf stat -a -e l3cache_0_0/event=0x21/
>> 
>> Signed-off-by: Agustin Vega-Frias <agustinv@...eaurora.org>
> 
> Thanks for respinning this; it looks good to me.
> 
>> +static int qcom_l3_cache__event_init(struct perf_event *event)
>> +{
>> +	struct l3cache_pmu *l3pmu = to_l3cache_pmu(event->pmu);
>> +	struct hw_perf_event *hwc = &event->hw;
>> +	struct perf_event *sibling;
> 
>> +	/*
>> +	 * We must NOT create groups containing mixed PMUs, although 
>> software
>> +	 * events are acceptable
>> +	 */
>> +	if (event->group_leader->pmu != event->pmu &&
>> +			!is_software_event(event->group_leader))
>> +		return -EINVAL;
>> +
>> +	list_for_each_entry(sibling, &event->group_leader->sibling_list,
>> +			group_entry)
>> +		if (sibling->pmu != event->pmu &&
>> +				!is_software_event(sibling))
>> +			return -EINVAL;
> 
> Sorry for spotting this so late, but I just  noticed that we don't
> validate the group isn't too large to ever fit in the HW, which is
> important for avoiding pointless work in perf_rotate_context() when the
> mux hrtimer callback occurs.
> 
> We fail to do that in other drivers, too, so that's something we should
> clean up more generally.
> 
> Here, I'd like to factor the group validation logic out into a helper:
> 
> #define event_num_counters(e)	(event_uses_long_counter(e) ? 2 : 1)
> 
> static bool qcom_l3_cache__validate_event_group(struct perf_event 
> *event)
> {
> 	struct perf_event *leader = event->group_leader;
> 	struct perf_event *sibling;
> 	int counters = 0;
> 
> 	/*
> 	 * We must NOT create groups containing mixed PMUs, although
> 	 * software.
> 	 */
> 	if (leader->pmu != event->pmu && !is_software_event(leader))
> 		return false;
> 
> 	counters = event_num_counters(event);
> 	counters += event_num_counters(leader);
> 
> 	list_for_each_entry(sibling, &leader->sibling_list, group_entry) {
> 		if (is_software_event(sibling))
> 			continue;
> 		if (sibling->pmu != event->pmu)
> 			return false;
> 		counters += event_num_counters(sibling);
> 	}
> 
> 	/*
> 	 * If the group requires more counters than the HW has, it
> 	 * cannot ever be scheduled.
> 	 */
> 	return counters <= L3_NUM_COUNTERS;
> }
> 
> ... where in qcom_l3_cache__event_init() we'd do:
> 
> 	if (!qcom_l3_cache__validate_event_group(event))
> 		return -EINVAL;
> 
> If you're happy with that, I can make that change when applying.

Sounds good. I'll also apply this change internally and test it out.
I'll send it as V6 if I get it ready today.

Thanks,
Agustin

-- 
Qualcomm Datacenter Technologies, Inc. on behalf of the Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a 
Linux Foundation Collaborative Project.