Open Source and information security mailing list archives
 
Date: Mon, 11 Mar 2024 09:13:42 -0700
From: Andi Kleen <ak@...ux.intel.com>
To: Adrian Hunter <adrian.hunter@...el.com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>,
	Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
	Ian Rogers <irogers@...gle.com>, linux-kernel@...r.kernel.org,
	linux-perf-users@...r.kernel.org
Subject: Re: [PATCH] perf scripts python: Add a script to run instances of
 perf script in parallel

On Sun, Mar 10, 2024 at 09:35:02PM +0200, Adrian Hunter wrote:
> Add a Python script to run a perf script command multiple times in
> parallel, using perf script options --cpu and --time so that each job
> processes a different chunk of the data.
> 
> Refer to the script's own help text at the end of the patch for more
> details.
> 
> The script is useful for Intel PT traces, that can be efficiently
> decoded by perf script when split by CPU and/or time ranges. Running
> jobs in parallel can decrease the overall decoding time.

This only optimizes for the run time of the decoder. Often when you do
analysis you have a non-trivial part of it in some analysis script too,
but you currently have no direct / easy way to parallelize that. It would
be better to support parallel pipelines.
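
A parallel-pipeline setup along those lines could be sketched roughly as
below. This is only an illustration: the `analyze.py` analysis command and
the per-CPU splitting scheme are assumptions, not anything from the patch.

```python
import shlex
import subprocess

def build_pipelines(ncpus, analysis_cmd="./analyze.py"):
    """Build one 'perf script | analysis' shell pipeline per CPU
    (illustrative only; analysis_cmd is a hypothetical script)."""
    return [
        f"perf script --cpu {cpu} | {shlex.quote(analysis_cmd)}"
        for cpu in range(ncpus)
    ]

def run_pipelines(ncpus):
    # Launch all pipelines at once, then wait for each; every pipeline
    # decodes one CPU's data and feeds it straight into the analysis
    # script, so decoding and analysis overlap.
    procs = [subprocess.Popen(cmd, shell=True) for cmd in build_pipelines(ncpus)]
    return [p.wait() for p in procs]
```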

TBH I'm not sure the script is worth it. If you need to do parallel
pipelines (which imho is the common case) it's probably better to just
write a custom shell script, which is not that difficult. It might be
better to have a helper that makes writing such scripts easier, 
e.g. figuring out reasonable options for manual parallelization
based on the input file. I think parts of your script do that, maybe
it is usable for that.
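
Such a helper might, for instance, split the trace's time range into equal
chunks and emit the corresponding --time options. The sketch below assumes
the start/end timestamps are already known; a real helper would derive them
from the perf.data file.

```python
def time_chunks(start, end, njobs):
    """Split the interval [start, end) seconds into njobs equal chunks
    and return the matching perf script --time options."""
    step = (end - start) / njobs
    return [
        f"--time {start + i * step:.9f},{start + (i + 1) * step:.9f}"
        for i in range(njobs)
    ]
```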

Also, as a default it would be better to just merge the
original output in order and print it on stdout.
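
For time-split chunks, that merge could be as simple as concatenating the
per-chunk output files in chunk order once every job has finished (CPU-split
output would instead need a timestamp-aware merge). The file layout here is
hypothetical:

```python
import sys

def merge_outputs(paths, out=sys.stdout):
    """Concatenate per-chunk output files, in the given (chunk) order,
    onto a single output stream."""
    for path in paths:
        with open(path) as f:
            for line in f:
                out.write(line)
```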

You should probably limit the number of jobs so that each job
covers some minimum length of trace, otherwise on systems with
many CPUs the jobs might be inefficiently short.
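
Capping the job count by a minimum per-job duration could look like the
following; the one-second floor is an arbitrary assumption, not a measured
threshold.

```python
def clamp_njobs(trace_seconds, ncpus, min_job_seconds=1.0):
    """Cap the parallel job count so each time chunk lasts at least
    min_job_seconds; always run at least one job."""
    by_length = int(trace_seconds // min_job_seconds)
    return max(1, min(ncpus, by_length))
```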

-Andi
