lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 1 May 2020 15:15:54 -0700
From:   Ian Rogers <irogers@...gle.com>
To:     Jiri Olsa <jolsa@...hat.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Mark Rutland <mark.rutland@....com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>,
        Stephane Eranian <eranian@...gle.com>
Subject: Re: [PATCH 2/2] lib kallsyms: parse using io api

On Fri, May 1, 2020 at 8:28 AM Ian Rogers <irogers@...gle.com> wrote:
>
> On Fri, May 1, 2020 at 5:23 AM Jiri Olsa <jolsa@...hat.com> wrote:
> >
> > On Thu, Apr 30, 2020 at 12:35:57PM -0700, Ian Rogers wrote:
> > > Perf record will call kallsyms__parse 4 times during startup and process
> > > megabytes of data. This changes kallsyms__parse to use the io library
> > > rather than fgets to improve performance of the user code by over 8%.
> > >
> > > Before:
> > >   Running 'internals/kallsyms-parse' benchmark:
> > >   Average kallsyms__parse took: 103.988 ms (+- 0.203 ms)
> > > After:
> > >   Running 'internals/kallsyms-parse' benchmark:
> > >   Average kallsyms__parse took: 95.571 ms (+- 0.006 ms)
> > >
> > > For a workload like:
> > > $ perf record /bin/true
> > > Run under 'perf record -e cycles:u -g' the time goes from:
> > > Before
> > > 30.10%     1.67%  perf     perf                [.] kallsyms__parse
> > > After
> > > 25.55%    20.04%  perf     perf                [.] kallsyms__parse
> > > So a little under 5% of the start-up time is removed. A lot of what
> > > remains is on the kernel side, but caching kallsyms within perf would
> > > at least impact memory footprint.
> >
> > with your change I'm getting following warnings:
> >
> > $ sudo ./perf record -a
> > Couldn't record kernel reference relocation symbol
> > Symbol resolution may be skewed if relocation was used (e.g. kexec).
> > Check /proc/kallsyms permission or run as root.
>
> I'll investigate, sorry in advance for sending this out too early.

It was an issue with the result in the case of success, it should be
the last result of the function pointer and not 0. Sorry about not
spotting that. Fixed in:
https://lore.kernel.org/lkml/20200501221315.54715-1-irogers@google.com/T/#md59bc11419ca5adc8ab29fa3460288329e8dca91

Thanks,
Ian

> > >
> > > Signed-off-by: Ian Rogers <irogers@...gle.com>
> > > ---
> > >  tools/lib/api/io.h          |  3 ++
> > >  tools/lib/symbol/kallsyms.c | 81 +++++++++++++++++++------------------
> > >  2 files changed, 44 insertions(+), 40 deletions(-)
> > >
> > > diff --git a/tools/lib/api/io.h b/tools/lib/api/io.h
> > > index b7e55b5f8a4a..777c20f6b604 100644
> > > --- a/tools/lib/api/io.h
> > > +++ b/tools/lib/api/io.h
> > > @@ -7,6 +7,9 @@
> > >  #ifndef __API_IO__
> > >  #define __API_IO__
> > >
> > > +#include <stdlib.h>
> > > +#include <unistd.h>
> >
> > was this missing?
>
> Yes, they were getting picked up by a transitive #include in
> synthesize-events.c, but given the call to read and use of size_t are
> here it makes sense for the #includes to be here.
>
> Thanks,
> Ian
>
> > jirka
> >
> > > +
> > >  struct io {
> > >       /* File descriptor being read/ */
> > >       int fd;
> >
> > SNIP
> >

Powered by blists - more mailing lists