lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 14 Jun 2017 07:44:26 -0300
From:   Arnaldo Carvalho de Melo <acme@...nel.org>
To:     Ingo Molnar <mingo@...nel.org>
Cc:     linux-kernel@...r.kernel.org,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Adrian Hunter <adrian.hunter@...el.com>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        David Ahern <dsahern@...il.com>,
        Namhyung Kim <namhyung@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Wang Nan <wangnan0@...wei.com>
Subject: Re: [PATCH 1/2] perf evsel: Fix probing of precise_ip level for
 default cycles event

Em Wed, Jun 14, 2017 at 07:45:01AM +0200, Ingo Molnar escreveu:
> * Arnaldo Carvalho de Melo <acme@...nel.org> wrote:
> > +++ b/tools/perf/util/evsel.c
> > @@ -273,6 +273,11 @@ struct perf_evsel *perf_evsel__new_cycles(void)
> >  	event_attr_init(&attr);
> > +	/*
> > +	 * Unnamed union member, not supported as struct member named
> > +	 * initializer in older compilers such as gcc 4.4.7
> > +	 */
> > +	attr.sample_period = 1;
  
> >  	perf_event_attr__set_max_precise_ip(&attr);
 
> Hm, so this really broke perf for me on my main system - 'perf top' and 'perf 
> report' only shows:
 
>  triton:~/tip> perf report --stdio
>  unwind: target platform=x86 is not supported
>  unwind: target platform=x86 is not supported
>  unwind: target platform=x86 is not supported
>  # To display the perf.data header info, please use --header/--header-only options.
>  # Total Lost Samples: 0

>  # Samples: 921K of event 'cycles:ppp'
>  # Event count (approx.): 921045

>  # Overhead  Command    Shared Object     Symbol              
>  # ........  .........  ................  ....................
>     99.93%  hackbench  [kernel.vmlinux]  [k] native_write_msr
>      0.07%  perf       [kernel.vmlinux]  [k] native_write_msr
 
> the bisection result is unambiguous:
 
>    7fd1d092b4337831d7ccbf3a74c07cb0b2089023 is the first bad commit
 
> proper output would be:
 
>  ...
>  # Total Lost Samples: 0
>  #
>  # Samples: 9K of event 'cycles'
>  # Event count (approx.): 4378583062
>  #
>  # Overhead  Command    Shared Object     Symbol                                 
>  # ........  .........  ................  .......................................
>      4.32%  hackbench  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
>      4.02%  hackbench  [kernel.vmlinux]  [k] unix_stream_read_generic
>      3.75%  hackbench  [kernel.vmlinux]  [k] filemap_map_pages
>      3.06%  hackbench  [kernel.vmlinux]  [k] __check_object_size
>      2.44%  hackbench  [kernel.vmlinux]  [k] _raw_spin_lock_irqsave
>      2.32%  hackbench  [kernel.vmlinux]  [k] native_queued_spin_lock_slowpath
>      2.22%  hackbench  [kernel.vmlinux]  [k] entry_SYSENTER_compat
>      1.90%  hackbench  [vdso]            [.] __vdso_gettimeofday
>      1.80%  hackbench  [kernel.vmlinux]  [k] _raw_spin_lock
>      1.80%  hackbench  [kernel.vmlinux]  [k] skb_set_owner_w
>      1.67%  hackbench  [kernel.vmlinux]  [k] kmem_cache_free
>      1.52%  hackbench  [kernel.vmlinux]  [k] skb_release_data
>      1.48%  hackbench  [kernel.vmlinux]  [k] common_file_perm
>      1.45%  hackbench  [kernel.vmlinux]  [k] page_fault
>      1.45%  hackbench  [kernel.vmlinux]  [k] cmpxchg_double_slab.isra.62
>      1.42%  hackbench  [kernel.vmlinux]  [k] new_sync_read
>      1.36%  hackbench  [kernel.vmlinux]  [k] __check_heap_object
> 
> Here's the hardware details:

No need for that, its simpler, brown paper bag, I concentrated on it not
returning -EINVAL and didn't test it enough, ditto with the other guys
that tested this on s/390 :-\

The default case gets the precise_ip right, i.e. 3 in this broadwell machine,
but remained with the sample_period=1 used to probe it, ugh.

[root@...et ~]# perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.021 MB perf.data (69 samples) ]
[root@...et ~]# perf evlist -v
cycles:ppp: size: 112, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1


If we use 'cycles:P' explicitely we can see that it uses the default sample_freq:

[root@...et ~]# perf record -e cycles:P usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
[root@...et ~]# perf evlist -v
cycles:P: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, mmap2: 1, comm_exec: 1
[root@...et ~]#

Can you try with the patch below, which is hackish but the minimal fix at this
point. Later I'll come up with a way of returning a fully configured cycles
evsel, which will make the tools code simpler, moving more stuff to the
libraries.

I'll look into the unwind case, where SRCARCH is being passed to the unwind
code uses both x86_64, not x86 (probably uses i386 or x86_32 for the 32-bit case).

- Arnaldo

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a7ce529ca36c..cda44b0e821c 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -276,10 +276,17 @@ struct perf_evsel *perf_evsel__new_cycles(void)
 	/*
 	 * Unnamed union member, not supported as struct member named
 	 * initializer in older compilers such as gcc 4.4.7
+	 *
+	 * Just for probing the precise_ip:
 	 */
 	attr.sample_period = 1;
 
 	perf_event_attr__set_max_precise_ip(&attr);
+	/*
+	 * Now let the usual logic to set up the perf_event_attr defaults
+	 * to kick in when we return and before perf_evsel__open() is called.
+	 */
+	attr.sample_period = 0;
 
 	evsel = perf_evsel__new(&attr);
 	if (evsel == NULL)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ