linux-kernel - Re: [PATCH] perf scripts python: Add a script to run instances of perf script in parallel

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <3a92ebdb-8923-46af-a020-0e12233262a9@intel.com>
Date: Mon, 11 Mar 2024 19:52:01 +0200
From: Adrian Hunter <adrian.hunter@...el.com>
To: Andi Kleen <ak@...ux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@...nel.org>, Jiri Olsa <jolsa@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Ian Rogers <irogers@...gle.com>,
 linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org
Subject: Re: [PATCH] perf scripts python: Add a script to run instances of
 perf script in parallel

On 11/03/24 18:13, Andi Kleen wrote:
> On Sun, Mar 10, 2024 at 09:35:02PM +0200, Adrian Hunter wrote:
>> Add a Python script to run a perf script command multiple times in
>> parallel, using perf script options --cpu and --time so that each job
>> processes a different chunk of the data.
>>
>> Refer to the script's own help text at the end of the patch for more
>> details.
>>
>> The script is useful for Intel PT traces, that can be efficiently
>> decoded by perf script when split by CPU and/or time ranges. Running
>> jobs in parallel can decrease the overall decoding time.
> 
> This only optimizes for the run time of the decoder. Often when you do
> analysis you have a non trivial part of it in some analysis script too,
> but you currently have no directi / easy way to paralelize that. It would 
> be better to support parallel pipelines.

It will parallelize any scripts and / or dlfilters that perf script
itself executes.

> 
> TBH I'm not sure the script is worth it. If you need to do parallel
> pipelines (which imho is the common case) it's probably better to just
> write a custom shell script, which is not that difficult.

It can be a pain to figure out how best to split the data if it is not
evenly distributed.

The script also has value as a reference or starting point for
users.

>                                                           It might be
> better to have a helper that makes writing such scripts easier, 
> e.g. figuring out reasonable options for manual parallelization
> based on the input file. I think parts of your script do that, maybe
> it is usable for that.

The --dry-run option shows the perf script commands, but an option
to pipe through another command could be added.

> 
> Also as a default output it would be better to just merge the 
> original output in order and output it on stdout.

That assumes that the output comes from perf script printf
output and not a perf script _script_.

If the data is split by CPU, it will not be in time order
if it is simply concatenated back together.

> 
> You should probably limit the number of jobs to some minimum
> length, otherwise on systems with many CPUs there might be
> inefficiently short jobs.

That happens for Intel PT (64 PSB minimum), but could be added
for the normal case also.