linux-kernel - [PATCH v7 09/12] perf record: implement -z,--compression

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1e360280-fdd3-0555-0daf-d711d4d0941f@linux.intel.com>
Date:   Tue, 12 Mar 2019 08:32:52 +0300
From:   Alexey Budankov <alexey.budankov@...ux.intel.com>
To:     Arnaldo Carvalho de Melo <acme@...nel.org>
Cc:     Jiri Olsa <jolsa@...hat.com>, Namhyung Kim <namhyung@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andi Kleen <ak@...ux.intel.com>,
        linux-kernel <linux-kernel@...r.kernel.org>
Subject: [PATCH v7 09/12] perf record: implement -z,--compression_level=n
 option


Implemented -z,--compression_level=n option that enables compression
of mmaped kernel data buffers content in runtime during perf record
mode collection.

Compression overhead has been measured for serial and AIO streaming
when profiling matrix multiplication workload:

    -------------------------------------------------------------
    | SERIAL			  | AIO-1                       |
----------------------------------------------------------------|
|-z | OVH(x) | ratio(x) size(MiB) | OVH(x) | ratio(x) size(MiB) |
|---------------------------------------------------------------|
| 0 | 1,00   | 1,000    179,424   | 1,00   | 1,000    187,527   |
| 1 | 1,04   | 8,427    181,148   | 1,01   | 8,474    188,562   |
| 2 | 1,07   | 8,055    186,953   | 1,03   | 7,912    191,773   |
| 3 | 1,04   | 8,283    181,908   | 1,03   | 8,220    191,078   |
| 5 | 1,09   | 8,101    187,705   | 1,05   | 7,780    190,065   |
| 8 | 1,05   | 9,217    179,191   | 1,12   | 6,111    193,024   |
-----------------------------------------------------------------

OVH = (Execution time with -z N) / (Execution time with -z 0)

ratio - compression ratio
size  - number of bytes that was compressed

	size ~= trace size x ratio

Signed-off-by: Alexey Budankov <alexey.budankov@...ux.intel.com>
---
 tools/perf/Documentation/perf-record.txt | 5 +++++
 tools/perf/builtin-record.c              | 6 +++++-
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index d1e6c1fd7387..632502b1f335 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -472,6 +472,11 @@ Also at some cases executing less trace write syscalls with bigger data size can
 shorter than executing more trace write syscalls with smaller data size thus lowering
 runtime profiling overhead.
 
+-z::
+--compression-level=n::
+Produce compressed trace using specified level n (no compression: 0 - default,
+fastest compression: 1, smallest trace: 22)
+
 --all-kernel::
 Configure all used events to run in kernel space.
 
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 47e0abe22192..9f1bba6d4331 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -346,7 +346,7 @@ static void record__aio_mmap_read_sync(struct record *rec)
 	struct perf_evlist *evlist = rec->evlist;
 	struct perf_mmap *maps = evlist->mmap;
 
-	if (!rec->opts.nr_cblocks)
+	if (!record__aio_enabled(rec))
 		return;
 
 	for (i = 0; i < evlist->nr_mmaps; i++) {
@@ -2166,6 +2166,10 @@ static struct option __record_options[] = {
 	OPT_CALLBACK(0, "affinity", &record.opts, "node|cpu",
 		     "Set affinity mask of trace reading thread to NUMA node cpu mask or cpu of processed mmap buffer",
 		     record__parse_affinity),
+#ifdef HAVE_ZSTD_SUPPORT
+	OPT_UINTEGER('z', "compression-level", &record.opts.comp_level,
+		     "Produce compressed trace using specified level (default: 0, fastest: 1, smallest: 22)"),
+#endif
 	OPT_END()
 };
 
-- 
2.20.1