Message-ID: <CBE3EBD5-EAC6-41B6-88C1-B15958591172@fb.com>
Date: Fri, 28 Sep 2018 04:53:52 +0000
From: Song Liu <songliubraving@...com>
To: Ravi Bangoria <ravi.bangoria@...ux.ibm.com>
CC: lkml <linux-kernel@...r.kernel.org>,
Kernel Team <Kernel-team@...com>, Tejun Heo <tj@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Jiri Olsa <jolsa@...nel.org>,
Alexey Budankov <alexey.budankov@...ux.intel.com>,
"Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com>,
Madhavan Srinivasan <maddy@...ux.vnet.ibm.com>
Subject: Re: [PATCH v3 1/1] perf: Sharing PMU counters across compatible
events
Hi Ravi,
> On Sep 27, 2018, at 9:33 PM, Ravi Bangoria <ravi.bangoria@...ux.ibm.com> wrote:
>
> Hi Song,
>
> On 09/25/2018 03:55 AM, Song Liu wrote:
>> This patch tries to enable PMU sharing. To make perf event scheduling
>> fast, we use special data structures.
>>
>> An array of "struct perf_event_dup" is added to the perf_event_context,
>> to remember all the duplicated events under this ctx. All the events
>> under this ctx has a "dup_id" pointing to its perf_event_dup. Compatible
>> events under the same ctx share the same perf_event_dup. The following
>> figure shows a simplified version of the data structure.
>>
>> ctx -> perf_event_dup -> master
>> ^
>> |
>> perf_event /|
>> |
>> perf_event /
>>
>
> I've not gone through the patch in detail, but I was specifically
> interested in scenarios where one perf instance is counting event
> systemwide and thus other perf instance fails to count the same
> event for a specific workload because that event can be counted
> in one hw counter only.
>
> Ex: https://lkml.org/lkml/2018/3/12/1011
>
> Seems this patch does not solve this issue. Please let me know if
> I'm missing anything.
>
In this case, unfortunately, the two events cannot share a counter,
because one of them is in the cpu ctx while the other belongs to a
task ctx. They have to go through rotation, so each event counts
50% of the time. However, if you have 2 compatible events in the cpu
ctx and 2 in the task ctx on the same counter, this patch lets each
event count 50% of the time instead of 25%.
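
To make the 50% vs. 25% arithmetic concrete, here is a rough sketch of
the rotation on a single contended hw counter. It is a simplified model,
not kernel code: the function name time_share and the even-rotation
assumption are mine, and it treats all events inside one ctx as
compatible with each other.

```python
def time_share(events_per_ctx, sharing):
    """Fraction of time each event counts on one contended hw counter.

    events_per_ctx maps a context name to the number of compatible
    events it holds. Assumption: the counter rotates evenly among slots.
    """
    if sharing:
        # Compatible events in the same ctx collapse into a single slot.
        slots = sum(1 for n in events_per_ctx.values() if n > 0)
    else:
        # Without sharing, every event needs its own turn on the counter.
        slots = sum(events_per_ctx.values())
    return 1.0 / slots if slots else 0.0


print(time_share({"cpu": 2, "task": 2}, sharing=False))  # 0.25
print(time_share({"cpu": 2, "task": 2}, sharing=True))   # 0.5
print(time_share({"cpu": 1, "task": 1}, sharing=True))   # 0.5
```

The last line is the case from your example: with one event in each ctx
there is nothing to merge inside either ctx, so both events stay at 50%.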

Another potential solution is to create a cgroup for the workload
and attach the perf event to that cgroup. Since cgroup events are
added to the cpu ctx, they can share counters with the system-wide
events.

I made this trade-off to keep the context switch O(1). If we shared
a hw counter between the cpu ctx and a task ctx, we would have to do
a linear-time comparison to identify events that can share the
counter.
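
A toy illustration of the difference, loosely following the dup_id /
perf_event_dup description quoted above. Everything here is hypothetical
(the Event and Ctx classes, master_of, find_shareable_linear, and using
a single integer "config" as the compatibility test); it only shows why
a per-ctx index is constant time while a cross-ctx match needs a scan.

```python
from dataclasses import dataclass


@dataclass
class Event:
    config: int        # stand-in for whatever decides compatibility
    dup_id: int = -1   # index into the owning ctx's dup table


class Ctx:
    def __init__(self):
        self.dups = []  # one entry per group of compatible events

    def add(self, event):
        # Matching happens once, when the event is added to the ctx,
        # not on the context-switch path.
        for idx, dup in enumerate(self.dups):
            if dup["master"].config == event.config:
                event.dup_id = idx
                return
        event.dup_id = len(self.dups)
        self.dups.append({"master": event})

    def master_of(self, event):
        # O(1): the event already knows its slot in this ctx.
        return self.dups[event.dup_id]["master"]


def find_shareable_linear(event, other_ctx_events):
    # O(n): an event from a different ctx has no precomputed index here,
    # so each candidate has to be compared in turn.
    for other in other_ctx_events:
        if other.config == event.config:
            return other
    return None
```

With sharing limited to one ctx, the switch path only ever needs the
master_of() style lookup; the scan would be needed each time a task ctx
is scheduled in against the cpu ctx.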

Thanks,
Song