linux-kernel - Re: [PATCH] perf inject: Flush ordered events on FINISHED_ROUND


Message-ID: <20201006023949.GA1682192@google.com>
Date:   Tue, 6 Oct 2020 11:39:49 +0900
From:   namhyung@...nel.org
To:     Jiri Olsa <jolsa@...hat.com>
Cc:     Arnaldo Carvalho de Melo <acme@...nel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Stephane Eranian <eranian@...gle.com>,
        Ian Rogers <irogers@...gle.com>,
        Al Grant <al.grant@...s.arm.com>,
        Adrian Hunter <adrian.hunter@...el.com>
Subject: Re: [PATCH] perf inject: Flush ordered events on FINISHED_ROUND

> > On Fri, Oct 02, 2020 at 10:03:17PM +0900, Namhyung Kim wrote:
> > > Below measures time and memory usage during the perf inject and
> > > report using ~190MB data file.
> > >
> > > Before:
> > >   perf inject:  11.09 s,  382148 KB
> > >   perf report:   8.05 s,  397440 KB
> > >
> > > After:
> > >   perf inject:  16.24 s,   83376 KB
> > >   perf report:   7.96 s,  216184 KB
> > >
> > > As you can see, it used 2x memory of the input size.  I guess it's
> > > because it needs to keep the copy for the whole input.  But I don't
> > > understand why processing time of perf inject increased..

Measuring it with time shows:

           before       after
  real    11.309s     17.040s
  user     8.084s     13.940s
  sys      6.535s      6.732s

So the difference comes from user space.  I've run perf record on
both (with cycles:u) and the dominant function is the same in both
cases: queue_event (46.98% before vs 65.87% after).
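
As a quick sanity check on the numbers above, here is a small sketch
that computes the before/after deltas and a rough estimate of the user
time spent in queue_event.  The names are only for illustration, and
the estimate assumes that the cycles percentage maps directly onto
user time, which is only approximately true.

```python
# Timings copied from the `time` output above, in seconds: (before, after).
TIMINGS = {
    "real": (11.309, 17.040),
    "user": (8.084, 13.940),
    "sys": (6.535, 6.732),
}

# Share of user-space cycles attributed to queue_event: (before, after).
QUEUE_EVENT_SHARE = (0.4698, 0.6587)


def deltas(timings):
    """Return after - before for each row, rounded to milliseconds."""
    return {name: round(after - before, 3)
            for name, (before, after) in timings.items()}


def queue_event_seconds(timings, share):
    """Rough user time in queue_event, assuming share of cycles == share of time."""
    user_before, user_after = timings["user"]
    return (round(user_before * share[0], 3), round(user_after * share[1], 3))


if __name__ == "__main__":
    for name, delta in deltas(TIMINGS).items():
        print(f"{name:5s} +{delta:.3f}s")
    before, after = queue_event_seconds(TIMINGS, QUEUE_EVENT_SHARE)
    print(f"queue_event (estimate): {before:.3f}s -> {after:.3f}s")
```

Under that assumption, user time grows by about 5.9 s while sys time
grows by about 0.2 s, and the estimated time in queue_event goes from
roughly 3.8 s to 9.2 s, which would account for most of the increase.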

It seems that flushing the queue adds more sorting overhead.

Thanks
Namhyung