linux-kernel - Re: [PATCH V3 0/6] event synthesization multithreading for perf record


Message-ID: <20171024130849.GD7045@kernel.org>
Date:   Tue, 24 Oct 2017 10:08:49 -0300
From:   Arnaldo Carvalho de Melo <acme@...nel.org>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     Jiri Olsa <jolsa@...hat.com>, "Liang, Kan" <kan.liang@...el.com>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "jolsa@...nel.org" <jolsa@...nel.org>,
        "wangnan0@...wei.com" <wangnan0@...wei.com>,
        "hekuang@...wei.com" <hekuang@...wei.com>,
        "namhyung@...nel.org" <namhyung@...nel.org>,
        "alexander.shishkin@...ux.intel.com" 
        <alexander.shishkin@...ux.intel.com>,
        "Hunter, Adrian" <adrian.hunter@...el.com>,
        "ak@...ux.intel.com" <ak@...ux.intel.com>
Subject: Re: [PATCH V3 0/6] event synthesization multithreading for perf
 record

Em Tue, Oct 24, 2017 at 02:59:44PM +0200, Ingo Molnar escreveu:
> 
> * Jiri Olsa <jolsa@...hat.com> wrote:
> 
> > I recently made some changes on threaded record, which are based
> > on Namhyung's time* API, which is needed to read/sort the data afterwards
> > 
> > but I wasn't able to get any substantial and consistent reduction of LOST events
> > and then I got sidetracked and did not finish, but it's in here:
> 
> So, in the context of system-wide profiling, the way that would work best I think 
> is the following:
> 
>   thread #0 binds itself to CPU#0 (via sched_setaffinity) and creates a per-CPU event on CPU#0
>   thread #1 binds itself to CPU#1 (via sched_setaffinity) and creates a per-CPU event on CPU#1
>   thread #2 binds itself to CPU#2 (via sched_setaffinity) and creates a per-CPU event on CPU#2

Right, that is how I think it should be done as well, and those threads
will just dump to separate files, in a per-session directory, with an
extra file for the session details, which now live in the header.
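
A minimal sketch of the per-CPU binding described above, for illustration
only. The struct cpu_worker type and the helper names are hypothetical and
are not taken from tools/perf; the event type (a software CPU clock) and
the sample period are arbitrary assumptions. Opening a system-wide event
usually needs privileges, so the sketch records whether it succeeded
instead of treating failure as fatal.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <pthread.h>
#include <sched.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define MAX_WORKERS 64	/* arbitrary cap for this sketch */

/* Hypothetical per-thread state; not a perf tool structure. */
struct cpu_worker {
	int cpu;	/* CPU this thread serves, set by the caller */
	int pinned;	/* 1 if sched_setaffinity() succeeded */
	int event_ok;	/* 1 if the per-CPU event could be opened */
};

/* Pin the calling thread to one CPU; pid 0 means "the caller". */
static int bind_self_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	return sched_setaffinity(0, sizeof(set), &set);
}

/* Open a system-wide (pid == -1) sampling event on a single CPU. */
static int open_cpu_event(int cpu)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;
	attr.config = PERF_COUNT_SW_CPU_CLOCK;
	attr.sample_period = 1000000;	/* assumed value */
	attr.sample_type = PERF_SAMPLE_TIME | PERF_SAMPLE_IP | PERF_SAMPLE_TID;
	attr.disabled = 1;

	/* glibc has no wrapper for perf_event_open, so use syscall(2). */
	return (int)syscall(SYS_perf_event_open, &attr, -1, cpu, -1, 0);
}

static void *worker_main(void *arg)
{
	struct cpu_worker *w = arg;
	int fd;

	w->pinned = (bind_self_to_cpu(w->cpu) == 0);
	fd = open_cpu_event(w->cpu);	/* may fail without privileges */
	w->event_ok = (fd >= 0);
	/* A real tool would mmap the ring buffer here and stream it
	 * to this CPU's own file in the session directory. */
	if (fd >= 0)
		close(fd);
	return NULL;
}

/* Run one worker per entry and wait for all of them.
 * Returns the number of workers that pinned themselves, or -1. */
static int run_workers(struct cpu_worker *workers, int nr)
{
	pthread_t tids[MAX_WORKERS];
	int i, started = 0, pinned = 0;

	if (nr < 0 || nr > MAX_WORKERS)
		return -1;
	for (i = 0; i < nr; i++) {
		if (pthread_create(&tids[i], NULL, worker_main, &workers[i]))
			break;
		started++;
	}
	for (i = 0; i < started; i++) {
		pthread_join(tids[i], NULL);
		pinned += workers[i].pinned;
	}
	return pinned;
}
```

The caller fills in workers[i].cpu for each online CPU and calls
run_workers(); because each thread pins itself before opening its event,
the reader and the ring buffer it drains stay on the same CPU.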

Later, the same thing happens at processing time. This time we'll have
contention to access global thread state, and the need for rounds of
PERF_SAMPLE_TIME based ordering, like what we have now in the
tools/perf/util/ordered-events.[ch] code, etc.
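
To make the ordering step concrete, here is a minimal sketch of merging
per-CPU streams by timestamp. It is a plain k-way merge, not the
round-based flushing done in ordered-events.[ch], and it assumes each
per-CPU stream is already sorted by time. The struct sample and
struct cpu_stream types and merge_by_time() are hypothetical names used
only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified sample: only what the merge needs. */
struct sample {
	uint64_t time;	/* timestamp, as PERF_SAMPLE_TIME would provide */
	int cpu;	/* CPU the sample was recorded on */
};

/* One per-CPU input, assumed to be sorted by time already. */
struct cpu_stream {
	const struct sample *samples;
	size_t nr;	/* number of samples in the stream */
	size_t pos;	/* next sample not yet consumed */
};

/*
 * Repeatedly take the stream whose next sample has the smallest
 * timestamp and copy that sample to out[]. Returns the number of
 * samples written, at most out_cap.
 */
static size_t merge_by_time(struct cpu_stream *streams, size_t nr_streams,
			    struct sample *out, size_t out_cap)
{
	size_t n = 0;

	while (n < out_cap) {
		struct cpu_stream *best = NULL;
		size_t i;

		for (i = 0; i < nr_streams; i++) {
			struct cpu_stream *s = &streams[i];

			if (s->pos == s->nr)
				continue;	/* stream exhausted */
			if (!best ||
			    s->samples[s->pos].time <
			    best->samples[best->pos].time)
				best = s;
		}
		if (!best)
			break;	/* every stream is drained */
		out[n++] = best->samples[best->pos++];
	}
	return n;
}
```

The linear scan over streams is fine for a handful of CPUs; with many
CPUs a min-heap keyed on the next timestamp would be the usual choice.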

This works for 'report', 'script', 'top', 'trace', etc., as it is
basically the model we already have. All the work that was done for
refcounting the thread, map, etc., as well as locking those rbtrees, would
finally be taken full advantage of.

- Arnaldo
 
> etc.
> 
> Is this how you implemented it?

> If the threads in the thread pool are just free-running then the scheduler might 
> not migrate them to the 'right' CPU that is streaming the perf events and there will 
> be a lot of cross-talk between CPUs.
> 
> Inherited events (default 'perf record') are tougher.
> 
> Thanks,
> 
> 	Ingo