Message-ID: <20160620185231.GA8411@kernel.org>
Date:	Mon, 20 Jun 2016 15:52:31 -0300
From:	Arnaldo Carvalho de Melo <acme@...nel.org>
To:	David Ahern <dsahern@...il.com>
Cc:	Arnaldo Carvalho de Melo <arnaldo.melo@...il.com>,
	Alexei Starovoitov <alexei.starovoitov@...il.com>,
	Arnaldo Carvalho de Melo <acme@...hat.com>,
	"Wangnan (F)" <wangnan0@...wei.com>, linux-kernel@...r.kernel.org,
	pi3orama@....com, Namhyung Kim <namhyung@...il.com>,
	Alexei Starovoitov <ast@...nel.org>,
	Jiri Olsa <jolsa@...nel.org>
Subject: Re: [PATCH 2/2] perf record: Add --dry-run option to check cmdline
 options

On Mon, Jun 20, 2016 at 12:16:55PM -0600, David Ahern wrote:
> On 6/20/16 12:13 PM, Arnaldo Carvalho de Melo wrote:
> > 'perf cc' seems sensible, and has the added bonus of being one letter
> > shorter :-)
 
> perf is now a general front-end to a compiler?

Well, it has been for quite a while already. What we're talking about
here is being able to do this:

  # cat filter.c 
  #include <uapi/linux/bpf.h>
  #define SEC(NAME) __attribute__((section(NAME), used))

  SEC("func=hrtimer_nanosleep rqtp->tv_nsec")
  int func(void *ctx, int err, long nsec)
  {
	return nsec > 1000;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  # perf trace -e nanosleep --event filter.c usleep 1
     0.063 ( 0.063 ms): usleep/8041 nanosleep(rqtp: 0x7fff62bead80) = 0
  # perf trace -e nanosleep --event filter.c usleep 2
     0.008 ( 0.008 ms): usleep/8325 nanosleep(rqtp: 0x7ffc2afdf3b0) ...
     0.008 (         ): perf_bpf_probe:func:(ffffffff811137d0) tv_nsec=2000)
     0.070 ( 0.070 ms): usleep/8325  ... [continued]: nanosleep()) = 0
  # 

The goal is to not call the clang compiler under the hood every time,
i.e. to pre-build the .o file, which will then be used when present.

What Wang did was to make that possible by adding this to ~/.perfconfig:

  # cat ~/.perfconfig 
  [llvm]
	dump-obj = true
  # 

This way, when we run it, we get:

  # trace -e nanosleep --event filter.c usleep 6
  LLVM: dumpping filter.o
     0.008 ( 0.008 ms): usleep/9189 nanosleep(rqtp: 0x7fff97a704d0                                        ) ...
     0.008 (         ): perf_bpf_probe:func:(ffffffff811137d0) tv_nsec=6000)
     0.070 ( 0.070 ms): usleep/9189  ... [continued]: nanosleep()) = 0
  #
  # file filter.o
  filter.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped
  # readelf -SW filter.o
  There are 7 section headers, starting at offset 0x148:

  Section Headers:
    [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
    [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
    [ 1] .strtab           STRTAB          0000000000000000 0000e8 00005a 00      0   0  1
    [ 2] .text             PROGBITS        0000000000000000 000040 000000 00  AX  0   0  4
    [ 3] func=hrtimer_nanosleep rqtp->tv_nsec PROGBITS        0000000000000000 000040 000028 00  AX  0   0  8
    [ 4] license           PROGBITS        0000000000000000 000068 000004 00  WA  0   0  1
    [ 5] version           PROGBITS        0000000000000000 00006c 000004 00  WA  0   0  4
    [ 6] .symtab           SYMTAB          0000000000000000 000070 000078 18      1   2  8
  Key to Flags:
    W (write), A (alloc), X (execute), M (merge), S (strings)
    I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
    O (extra OS processing required) o (OS specific), p (processor specific)
  #

The idea is to generate this .o file explicitly and then, when it is found
and somehow checked to match what is in filter.c, short-circuit the process,
bypassing the clang call and using filter.o directly.
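
A minimal sketch of what that "somehow checked" step could look like,
assuming a plain mtime comparison between the two files. How the check
should really be done is not settled here, and obj_is_up_to_date() and
report_choice() are hypothetical names for illustration, not existing perf
functions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper, not part of perf: returns true when 'obj' exists
 * and is at least as new as 'src', i.e. a cached filter.o may be reused.
 * Assumption: comparing mtimes is good enough; a content hash would be
 * stricter but needs somewhere to store the hash.
 */
static bool obj_is_up_to_date(const char *src, const char *obj)
{
	struct stat st_src, st_obj;

	assert(src && obj);

	if (stat(obj, &st_obj) != 0)
		return false;	/* no cached object: clang has to run */

	if (stat(src, &st_src) != 0)
		return true;	/* only the .o was shipped: use it as is */

	/* Equal timestamps count as fresh (one second granularity). */
	return st_obj.st_mtime >= st_src.st_mtime;
}

/* Purely illustrative: say which path would be taken. */
static void report_choice(const char *src, const char *obj)
{
	if (obj_is_up_to_date(src, obj))
		printf("using cached %s, skipping clang\n", obj);
	else
		printf("%s missing or stale, compiling %s\n", obj, src);
}
```

Treating a missing filter.c as "use the .o" is a design choice made for
this sketch, so that a system shipping only the pre-built object still
works.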

This will remove the need to have clang on embedded systems, for instance,
and will speed up using eBPF scripts with perf.

- Arnaldo
