Message-ID: <4611984d-0d77-1584-3011-768c80c261af@linux.intel.com>
Date: Tue, 20 Jun 2017 20:10:06 +0300
From: Alexey Budankov <alexey.budankov@...ux.intel.com>
To: Mark Rutland <mark.rutland@....com>
Cc: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Andi Kleen <ak@...ux.intel.com>,
Kan Liang <kan.liang@...el.com>,
Dmitri Prokhorov <Dmitry.Prohorov@...el.com>,
Valery Cherepennikov <valery.cherepennikov@...el.com>,
David Carrillo-Cisneros <davidcc@...gle.com>,
Stephane Eranian <eranian@...gle.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3 1/n] perf/core: addressing 4x slowdown during
per-process profiling of STREAM benchmark on Intel Xeon Phi
On 20.06.2017 19:37, Mark Rutland wrote:
> On Tue, Jun 20, 2017 at 06:22:56PM +0300, Alexey Budankov wrote:
>> On 20.06.2017 16:36, Mark Rutland wrote:
>>> On Mon, Jun 19, 2017 at 11:31:59PM +0300, Alexey Budankov wrote:
>>>> On 15.06.2017 22:56, Mark Rutland wrote:
>>>>> On Thu, Jun 15, 2017 at 08:41:42PM +0300, Alexey Budankov wrote:
>>>>>> +static int
>>>>>> +perf_cpu_tree_iterate(struct rb_root *tree,
>>>>>> +		      perf_cpu_tree_callback_t callback, void *data)
>>>>>> +{
>>>>>> +	int ret = 0;
>>>>>> +	struct rb_node *node;
>>>>>> +	struct perf_event *event;
>>>>>> +
>>>>>> +	WARN_ON_ONCE(!tree);
>>>>>> +
>>>>>> +	for (node = rb_first(tree); node; node = rb_next(node)) {
>>>>>> +		struct perf_event *node_event = container_of(node,
>>>>>> +				struct perf_event, group_node);
>>>>>> +
>>>>>> +		list_for_each_entry(event, &node_event->group_list,
>>>>>> +				group_list_entry) {
>>>>>> +			ret = callback(event, data);
>>>>>> +			if (ret)
>>>>>> +				return ret;
>>>>>> +		}
>>>>>> +	}
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>>
>>>>> If you need to iterate over every event, you can use the list that
>>>>> threads the whole tree.
>>>>
>>>> Could you please explain more on that?
>>>
>>> In Peter's original suggestion, we'd use a threaded tree rather than a
>>> tree of lists.
>>>
>>> i.e. you'd have something like:
>>>
>>> struct threaded_rb_node {
>>> 	struct rb_node		node;
>>> 	struct list_head	head;
>>> };
>>
>> Is this for every group leader?
>
> Yes; *every* group leader would be directly in the threaded rb tree.
In this case the tree's key needs to be something trickier than just
event->cpu. To avoid that complication, group_list was introduced. BTW,
addressing the perf_event_tree_delete issue doesn't look like a big
change now:
static void
perf_cpu_tree_delete(struct rb_root *tree, struct perf_event *event)
{
	struct perf_event *next;

	WARN_ON_ONCE(!tree || !event);

	list_del_init(&event->group_entry);

	if (!RB_EMPTY_NODE(&event->group_node)) {
		if (!list_empty(&event->group_list)) {
			next = list_first_entry(&event->group_list,
					struct perf_event, group_entry);
			list_replace_init(&event->group_list,
					&next->group_list);
			rb_replace_node(&event->group_node,
					&next->group_node, tree);
		} else {
			rb_erase(&event->group_node, tree);
		}
		RB_CLEAR_NODE(&event->group_node);
	}
}
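For illustration only, here is a minimal userspace sketch of the
promote-on-delete idea above. The kernel rb-tree and list machinery is
deliberately simplified away: an array indexed by cpu stands in for the
rb-tree, and a singly linked list stands in for group_list, so the
names (event, tree_insert, tree_delete) are hypothetical and only
mirror the shape of the patch, not its actual API:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins: tree[cpu] plays the role of the rb-tree node
 * keyed by event->cpu; each slot heads a list of same-cpu events. */
#define MAX_CPUS 4

struct event {
	int cpu;
	struct event *next;	/* remaining same-cpu events ("group_list") */
};

static struct event *tree[MAX_CPUS];

static void tree_insert(struct event *e)
{
	/* First event for a cpu becomes the "tree node"; later ones
	 * just join its list. */
	e->next = tree[e->cpu];
	tree[e->cpu] = e;
}

static void tree_delete(struct event *e)
{
	struct event **pp = &tree[e->cpu];

	if (*pp == e) {
		/* e is the tree node: promote the next list member,
		 * mirroring list_replace_init() + rb_replace_node()
		 * in the patch above. */
		*pp = e->next;
		return;
	}
	for (; *pp; pp = &(*pp)->next) {
		if (*pp == e) {
			/* Plain list removal; the tree node is untouched,
			 * like the rb_erase()-free path in the patch. */
			*pp = e->next;
			return;
		}
	}
}
```

The point of the sketch is the first branch of tree_delete(): removing
the event that owns the tree slot does not erase the slot if other
same-cpu events remain, it just re-heads the list, which is what keeps
the delete O(1) for group members and avoids a rebalancing in the
common case.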
>
>> Which objects does the head keep?
>
> Sorry, I'm not sure how to answer that. Did the above clarify?
>
> If not, could you rephrase the question?
>
> Thanks,
> Mark.
>