Date: Fri, 11 Nov 2016 16:52:35 -0500
From: "Leeder, Neil" <nleeder@...eaurora.org>
To: Will Deacon <will.deacon@....com>,
Mark Rutland <mark.rutland@....com>
Cc: Catalin Marinas <catalin.marinas@....com>,
Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
Mark Langsdorf <mlangsdo@...hat.com>,
Mark Salter <msalter@...hat.com>, Jon Masters <jcm@...hat.com>,
Timur Tabi <timur@...eaurora.org>, cov@...eaurora.org,
nleeder@...eaurora.org
Subject: Re: [PATCH v7] soc: qcom: add l2 cache perf events driver
Hi Will,
On 11/9/2016 1:16 PM, Will Deacon wrote:
> On Wed, Nov 09, 2016 at 05:54:13PM +0000, Mark Rutland wrote:
>> On Fri, Oct 28, 2016 at 04:50:13PM -0400, Neil Leeder wrote:
>>> + struct perf_event *events[MAX_L2_CTRS];
>>> + struct l2cache_pmu *l2cache_pmu;
>>> + DECLARE_BITMAP(used_counters, MAX_L2_CTRS);
>>> + DECLARE_BITMAP(used_groups, L2_EVT_GROUP_MAX + 1);
>>> + int group_to_counter[L2_EVT_GROUP_MAX + 1];
>>> + int irq;
>>> + /* The CPU that is used for collecting events on this cluster */
>>> + int on_cpu;
>>> + /* All the CPUs associated with this cluster */
>>> + cpumask_t cluster_cpus;
>>
>> I'm still uncertain about aggregating all cluster PMUs into a larger
>> PMU, given the cluster PMUs are logically independent (at least in terms
>> of the programming model).
>>
>> However, from what I understand, the x86 uncore PMU drivers aggregate
>> symmetric instances of uncore PMUs (and also aggregate across packages
>> to the same logical PMU).
>>
>> Whatever we do, it would be nice for the uncore drivers to align on a
>> common behaviour (and I think we're currently going the opposite route
>> with Cavium's uncore PMU). Will, thoughts?
>
> I'm not a big fan of aggregating this stuff. Ultimately, the user in the
> driving seat of perf is going to need some knowledge about the topology of
> the system in order to perform sensible profiling using an uncore PMU.
> If the kernel tries to present a single, unified PMU then we paint ourselves
> into a corner when the hardware isn't as symmetric as we want it to be
> (big/little on the CPU side is the extreme example of this). If we want
> to be consistent, then exposing each uncore unit as a separate PMU is
> the way to go. That doesn't mean we can't aggregate the components of a
> distributed PMU (e.g. the CCN or the SMMU), but we don't want to aggregate
> at the programming interface/IP block level.
>
> We could consider exposing some topology information in sysfs if that's
> seen as an issue with the non-aggregated case.
>
> Will
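Fwiw, exposing the topology through sysfs seems easy enough: a cpumask
attribute on each PMU instance, much like some existing uncore drivers
already provide. A rough sketch, assuming each cluster instance embeds
its own struct pmu (the names below are made up, not from the patch):

  /* hypothetical per-instance wrapper for the non-aggregated case */
  struct cluster_pmu {
          struct pmu pmu;
          cpumask_t cluster_cpus;
  };

  static ssize_t cpumask_show(struct device *dev,
                              struct device_attribute *attr, char *buf)
  {
          /* perf core points the device's drvdata at the struct pmu */
          struct pmu *pmu = dev_get_drvdata(dev);
          struct cluster_pmu *cluster = container_of(pmu,
                                             struct cluster_pmu, pmu);

          /* prints e.g. "0-3" for the CPUs behind this instance */
          return cpumap_print_to_pagebuf(true, buf, &cluster->cluster_cpus);
  }
  static DEVICE_ATTR_RO(cpumask);

  static struct attribute *cluster_pmu_cpumask_attrs[] = {
          &dev_attr_cpumask.attr,
          NULL,
  };

  static const struct attribute_group cluster_pmu_cpumask_group = {
          .attrs = cluster_pmu_cpumask_attrs,
  };

with cluster_pmu_cpumask_group hung off pmu.attr_groups before
perf_pmu_register(). So the mechanics aren't the problem.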
But is there actually a use case for individual uncore PMUs when they
can't be used in task mode or per-cpu? The main (only?) use will be in
system mode, in which case surely it makes sense to provide a single
aggregated count?
With individual PMUs exposed, there will potentially be dozens of nodes
for userspace to collect from, which would make perf command-line usage
unwieldy at best.
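For example (instance names and event encoding invented here, just to
illustrate the point):

  # one event per cluster instance, on a 4-cluster part
  perf stat -a -e l2cache_0/config=0x400/ -e l2cache_1/config=0x400/ \
               -e l2cache_2/config=0x400/ -e l2cache_3/config=0x400/ sleep 1

versus a single aggregated event:

  perf stat -a -e l2cache/config=0x400/ sleep 1

and that's on a small part - a server-class system could have dozens of
clusters.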
Neil
--
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm
Technologies Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.