Message-ID: <53A1EC2E.1010706@gmail.com>
Date: Wed, 18 Jun 2014 13:44:46 -0600
From: David Ahern <dsahern@...il.com>
To: Jiri Olsa <jolsa@...nel.org>, linux-kernel@...r.kernel.org
CC: Arnaldo Carvalho de Melo <acme@...nel.org>,
Corey Ashford <cjashfor@...ux.vnet.ibm.com>,
Frederic Weisbecker <fweisbec@...il.com>,
Ingo Molnar <mingo@...nel.org>,
Jean Pihet <jean.pihet@...aro.org>,
Namhyung Kim <namhyung@...nel.org>,
Paul Mackerras <paulus@...ba.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>
Subject: Re: [PATCHv2 00/18] perf tools: Factor ordered samples queue
On 6/18/14, 8:58 AM, Jiri Olsa wrote:
> hi,
> this patchset factors out the session's ordered samples queue
> and allows limiting the size of that queue.
>
> v2 changes:
> - several small changes for review comments (Namhyung)
>
>
> The report command queues events until either of the following
> conditions is reached:
> - a PERF_RECORD_FINISHED_ROUND event is processed
> - the end of the perf.data file is reached
>
> Either condition forces the queue to flush events while keeping
> all allocated memory for subsequent events.
>
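> Roughly, the flush trigger looks like this (a minimal sketch in
> plain C, not the actual perf code; the names are illustrative):
>
>   #include <stdbool.h>
>
>   enum oe_flush {
>           OE_FLUSH_NONE,
>           OE_FLUSH_ROUND, /* PERF_RECORD_FINISHED_ROUND processed */
>           OE_FLUSH_FINAL, /* end of the perf.data file reached */
>   };
>
>   /* Decide whether the queued events must be flushed now. */
>   enum oe_flush need_flush(bool finished_round, bool eof)
>   {
>           if (eof)
>                   return OE_FLUSH_FINAL;
>           if (finished_round)
>                   return OE_FLUSH_ROUND;
>           return OE_FLUSH_NONE;
>   }
>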
> If PERF_RECORD_FINISHED_ROUND events are missing, the queue
> allocates memory for every single event in the perf.data file.
> This can lead to enormous memory consumption and a considerable
> slowdown of the report command for huge perf.data files.
>
> With the queue allocation limit set to 100 MB, I got around a
> 15% speedup when reporting a ~10GB perf.data file.
>
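> The limit works roughly like this (again a simplified sketch with
> illustrative names, not the code from the patches):
>
>   #include <stddef.h>
>   #include <stdlib.h>
>
>   struct ordered_events {
>           size_t cur_alloc_size; /* bytes currently allocated */
>           size_t max_alloc_size; /* cap, e.g. 100 MB */
>   };
>
>   /* Placeholder: the real code would process the oldest queued
>    * events and reuse or release their memory. */
>   static void flush_oldest_events(struct ordered_events *oe)
>   {
>           oe->cur_alloc_size = 0;
>   }
>
>   /* Allocate room for one more event; if the cap would be
>    * exceeded, flush part of the queue first instead of growing
>    * memory without bound. */
>   void *alloc_event(struct ordered_events *oe, size_t size)
>   {
>           if (oe->cur_alloc_size + size > oe->max_alloc_size)
>                   flush_oldest_events(oe);
>           oe->cur_alloc_size += size;
>           return malloc(size);
>   }
>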
> current code:
> Performance counter stats for './perf.old report --stdio -i perf-test.data' (3 runs):
>
> 621,685,704,665 cycles ( +- 0.52% )
> 873,397,467,969 instructions ( +- 0.00% )
>
> 286.133268732 seconds time elapsed ( +- 1.13% )
>
> with patches:
> Performance counter stats for './perf report --stdio -i perf-test.data' (3 runs):
>
> 603,933,987,185 cycles ( +- 0.45% )
> 869,139,445,070 instructions ( +- 0.00% )
>
> 245.337510637 seconds time elapsed ( +- 0.49% )
>
>
> The speedup seems to come mainly from fewer cycles spent
> servicing page faults:
>
> current code:
> 4.44% 0.01% perf.old [kernel.kallsyms] [k] page_fault
>
> with patches:
> 1.45% 0.00% perf [kernel.kallsyms] [k] page_fault
>
> current code (faults event):
> 6,643,807 faults ( +- 0.36% )
>
> with patches (faults event):
> 2,214,756 faults ( +- 3.03% )
>
>
> Also, one of our biggest memory spenders is now under control,
> and the ordered events queue code is moved into a separate object
> with a clear interface, ready to be used by other commands such
> as script.
>
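> The interface is intentionally small; roughly along these lines
> (illustrative prototypes, see the patches for the exact names):
>
>   #include <stdint.h>
>   typedef uint64_t u64;
>
>   union perf_event;      /* opaque here */
>   struct ordered_events; /* opaque here */
>
>   void ordered_events__init(struct ordered_events *oe, u64 max_alloc_size);
>   void ordered_events__free(struct ordered_events *oe);
>   int  ordered_events__queue(struct ordered_events *oe,
>                              union perf_event *event, u64 timestamp);
>   int  ordered_events__flush(struct ordered_events *oe);
>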
> The patches are also available here:
> git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> perf/core_ordered_events
>
I've skimmed through the patches. What happens if you are in the
middle of a round when the max queue size is reached?

I need to find some time for a detailed review and to run through
some stress-test scenarios; a couple that come to mind:

  perf sched record -- perf bench sched pipe

  perf kvm record while booting a nested VM, which causes a lot of
  VMEXITs

David