lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 13 Oct 2017 12:09:37 -0300
From:   Arnaldo Carvalho de Melo <acme@...nel.org>
To:     "Liang, Kan" <kan.liang@...el.com>
Cc:     "peterz@...radead.org" <peterz@...radead.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "jolsa@...nel.org" <jolsa@...nel.org>,
        "wangnan0@...wei.com" <wangnan0@...wei.com>,
        "hekuang@...wei.com" <hekuang@...wei.com>,
        "namhyung@...nel.org" <namhyung@...nel.org>,
        "alexander.shishkin@...ux.intel.com" 
        <alexander.shishkin@...ux.intel.com>,
        "Hunter, Adrian" <adrian.hunter@...el.com>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>
Subject: Re: [PATCH 3/4] perf record: event synthesization multithreading
 support

Em Fri, Oct 13, 2017 at 02:58:40PM +0000, Liang, Kan escreveu:
> > Em Fri, Oct 13, 2017 at 07:09:26AM -0700, kan.liang@...el.com escreveu:
> > > From: Kan Liang <Kan.liang@...el.com>
> > >
> > > The process function process_synthesized_event writes the process
> > > result to perf.data, which is not multithreading friendly.
> > >
> > > Realloc buffer for each thread to temporarily keep the processing
> > > result. Write them to the perf.data at the end of event synthesization.
> > > The new method doesn't impact the final result, because
> > >  - The order of the synthesized event is not important.
> > >  - The number of synthesized event is limited. Usually, it only needs
> > >    hundreds of Kilobyte to store all the synthesized event.
> > >    It's unlikly failed because of lack of memory.
> > 
> > Why not write to a per cpu file and then at the end merge them? 
> 
> I just thought merging from memory should be faster than merging from files.
> 
> > Leave
> > the memory management to the kernel, i.e. in most cases you may even not
> > end up touching the disk, when memory is plentiful, just rewind the per
> > event files and go on dumping to the main perf.data file.
 
> Agree. I will do it.
 
> > At some point we may just don't do this merging, and keep per cpu files
> > all the way to perf report, etc. This would be a first foray into
> > that...
 
> Yes, but it's a little bit complex.

Sure thing :-)

I think I handwaved an outline in a previous patch discussion, but
basically we already read and immediately process events from multiple
perf_mmap entries that are being produced by the kernel in 'perf top' we
need to continue doing that, but with a thread for each perf_mmap, and
then probably use the ordered_samples code with some algorithm to pick a
"PERF_RECORD_FINISHED_ROUND" at regular intervals to order the batch and
then feed it to the hists code, something along those lines.

So doing it first for 'perf top' seems better, then for 'record' and
'report' (that is what 'top' is) its just to unplug the live event
processing and instead dump for multiple files, for report then it
becomes the 'perf top' event processing part, i.e. one thread per file +
ordered_samples.

> I think I will do it in a separate improvement patch series later.
> For now, I will do the file merge.

Thanks

- Arnaldo

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ