Message-ID: <20170620134442.GG28157@leverpostej>
Date:   Tue, 20 Jun 2017 14:44:44 +0100
From:   Mark Rutland <mark.rutland@....com>
To:     Alexey Budankov <alexey.budankov@...ux.intel.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Andi Kleen <ak@...ux.intel.com>,
        Kan Liang <kan.liang@...el.com>,
        Dmitri Prokhorov <Dmitry.Prohorov@...el.com>,
        Valery Cherepennikov <valery.cherepennikov@...el.com>,
        David Carrillo-Cisneros <davidcc@...gle.com>,
        Stephane Eranian <eranian@...gle.com>,
        linux-kernel@...r.kernel.org
Subject: Re: [RESEND PATCH v3 1/n] perf/core: addressing 4x slowdown during
 per-process profiling of STREAM benchmark on Intel Xeon Phi

On Fri, Jun 16, 2017 at 02:03:58AM +0300, Alexey Budankov wrote:
> perf/core: use rb trees for pinned/flexible groups
> 
> By default, the userspace perf tool opens per-cpu task-bound events
> when sampling, so for N logical events requested by the user, the tool
> will open N * NR_CPUS events.
> 
> In the kernel, we mux events with a hrtimer, periodically rotating the
> flexible group list and trying to schedule each group in turn. We
> skip groups whose cpu filter doesn't match. So when we get unlucky,
> we can walk N * (NR_CPUS - 1) groups pointlessly for each hrtimer
> invocation.
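
In sketch form, that walk is roughly the following (illustrative names,
not the kernel's actual code; assumes <linux/list.h>):

struct sketch_event {
	struct list_head	group_entry;
	int			cpu;	/* -1 means "any CPU" */
};

static void sketch_flexible_sched_in(struct list_head *flexible_groups,
				     int this_cpu)
{
	struct sketch_event *event;

	list_for_each_entry(event, flexible_groups, group_entry) {
		/* The CPU filter check that usually fails: */
		if (event->cpu != -1 && event->cpu != this_cpu)
			continue;

		/* ... try to schedule this group on the PMU ... */
	}
}
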
> 
> This has been observed to result in significant overhead when running
> the STREAM benchmark on 272 core Xeon Phi systems.
> 
> One way to avoid this is to place our events into an rb tree sorted by
> CPU filter, so that our hrtimer can skip to the current CPU's
> list and ignore everything else.
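
In sketch form, the lookup that would allow (group_node and the cpu
field follow this patch; the helper itself is hypothetical):

static struct perf_event *
perf_cpu_tree_find(struct rb_root *tree, int cpu)
{
	struct rb_node *node = tree->rb_node;

	while (node) {
		struct perf_event *event =
			rb_entry(node, struct perf_event, group_node);

		if (cpu < event->cpu)
			node = node->rb_left;
		else if (cpu > event->cpu)
			node = node->rb_right;
		else
			return event;	/* head of this CPU's group_list */
	}

	return NULL;
}
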
> 
> As a step towards that, this patch replaces event group lists with rb
> trees.
> 
> Signed-off-by: Alexey Budankov <alexey.budankov@...ux.intel.com>
> ---
>  include/linux/perf_event.h |  18 ++-
>  kernel/events/core.c       | 393 +++++++++++++++++++++++++++++++++------------
>  2 files changed, 307 insertions(+), 104 deletions(-)
> 
> Addressed Mark Rutland's comments from the previous patch version.

... then this should be v4, no?

Which comments? Could you please write a changelog in future?

In future, please send patches as a series with a known upper bound
rather than N. It's really painful to find them when they're sent
separately, with no indication of how many there are.

[...]

> +/*
> + * Delete a group from a tree. If the group is directly attached to the tree
> + * it also detaches all groups on the group's group_list list.
> + */
> +static void
> +perf_cpu_tree_delete(struct rb_root *tree, struct perf_event *event)
> +{
> +	WARN_ON_ONCE(!tree || !event);
> +
> +	if (RB_EMPTY_NODE(&event->group_node)) {
> +		list_del_init(&event->group_entry);
> +	} else {
> +		struct perf_event *list_event, *list_next;
> +
> +		rb_erase(&event->group_node, tree);
> +		list_for_each_entry_safe(list_event, list_next,
> +				&event->group_list, group_entry)
> +			list_del_init(&list_event->group_entry);
> +	}
> +}

As I commented on the previous version, this means that all groups
which were (incidentally) hanging off this one are removed, and can no
longer be reached via the tree.

Surely one of the remaining groups should be added to the tree?

I don't believe that behaviour is correct.
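
If promotion is the intended fix, a minimal sketch (reusing this
patch's field names; purely illustrative and untested):

static void
perf_cpu_tree_delete(struct rb_root *tree, struct perf_event *event)
{
	WARN_ON_ONCE(!tree || !event);

	if (RB_EMPTY_NODE(&event->group_node)) {
		/* On a representative's group_list, not in the tree. */
		list_del_init(&event->group_entry);
	} else if (list_empty(&event->group_list)) {
		/* In the tree, and no other groups share its CPU. */
		rb_erase(&event->group_node, tree);
	} else {
		/*
		 * In the tree, with other groups hanging off it:
		 * promote the first of those to take its place, moving
		 * the rest onto the promoted group's (assumed empty)
		 * group_list.
		 */
		struct perf_event *next =
			list_first_entry(&event->group_list,
					 struct perf_event, group_entry);

		list_del_init(&next->group_entry);
		list_splice_init(&event->group_list, &next->group_list);
		rb_replace_node(&event->group_node, &next->group_node, tree);
	}
	RB_CLEAR_NODE(&event->group_node);
}

rb_replace_node() keeps the tree valid without rebalancing, since the
promoted group sorts identically (same CPU filter).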

That said, I believe it would be simpler to reason about a threaded
rb-tree here, since that special case wouldn't exist.

Thanks,
Mark.
