linux-kernel - Re: [RFCv2 00/48] perf tools: Add threads to record command

Message-ID: <20180924142927.GA22809@krava>
Date:   Mon, 24 Sep 2018 16:29:27 +0200
From:   Jiri Olsa <jolsa@...hat.com>
To:     Alexey Budankov <alexey.budankov@...ux.intel.com>
Cc:     Jiri Olsa <jolsa@...nel.org>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        lkml <linux-kernel@...r.kernel.org>,
        Ingo Molnar <mingo@...nel.org>,
        Namhyung Kim <namhyung@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Peter Zijlstra <a.p.zijlstra@...llo.nl>,
        Andi Kleen <andi@...stfloor.org>
Subject: Re: [RFCv2 00/48] perf tools: Add threads to record command

On Mon, Sep 24, 2018 at 04:09:09PM +0300, Alexey Budankov wrote:
> Hi,
> 
> On 24.09.2018 10:02, Alexey Budankov wrote:
> > Hi,
> > 
> > On 23.09.2018 22:30, Jiri Olsa wrote:
> >> On Fri, Sep 21, 2018 at 09:13:08AM +0300, Alexey Budankov wrote:
> >>
> >> SNIP
> >>
> >>> Events:
> >>> cpu/period=P,event=0x3c/Duk;CPU_CLK_UNHALTED.THREAD
> >>> cpu/period=P,umask=0x3/Duk;CPU_CLK_UNHALTED.REF_TSC
> >>> cpu/period=P,event=0xc0/Duk;INST_RETIRED.ANY
> >>> cpu/period=0xaae61,event=0xc2,umask=0x10/uk;UOPS_RETIRED.ALL
> >>> cpu/period=0x11171,event=0xc2,umask=0x20/uk;UOPS_RETIRED.SCALAR_SIMD
> >>> cpu/period=0x11171,event=0xc2,umask=0x40/uk;UOPS_RETIRED.PACKED_SIMD
> >>>
> >>> =================================================
> >>>
> >>> Command:
> >>> /usr/bin/time /tmp/vtune_amplifier_2019.574715/bin64/perf.thr record --threads=T \
> >>> 	-a -N -B -T -R --call-graph dwarf,1024 --user-regs=ip,bp,sp \
> >>>         -e cpu/period=P,event=0x3c/Duk,\
> >>>            cpu/period=P,umask=0x3/Duk,\
> >>>            cpu/period=P,event=0xc0/Duk,\
> >>>            cpu/period=0x30d40,event=0xc2,umask=0x10/uk,\
> >>>            cpu/period=0x4e20,event=0xc2,umask=0x20/uk,\
> >>>            cpu/period=0x4e20,event=0xc2,umask=0x40/uk \
> >>>          --clockid=monotonic_raw -- ./matrix.(icc|gcc)
> >>
> >> hum, so I guess the results suck because of the -a option,
> >> getting extra samples for all the perf record threads
> >>
> >> could you try without the -a? you monitor only user events,
> >> so you're interested only in ./matrix.* samples, right?
> > 
> > Ok, trying without -a, in per-process mode. 
> 
> Command:
> 
> /usr/bin/time ./perf.thr record --threads=T \
> 	-N -B -T -R --call-graph dwarf,1024 --user-regs=ip,bp,sp \
> 	-e cpu/period=P,event=0x3c/Duk,\
> 	   cpu/period=P,umask=0x3/Duk,\
> 	   cpu/period=P,event=0xc0/Duk,\
> 	   cpu/period=0xaae61,event=0xc2,umask=0x10/uk,\
> 	   cpu/period=0x11171,event=0xc2,umask=0x20/uk,\
> 	   cpu/period=0x11171,event=0xc2,umask=0x40/uk \
> 	--clockid=monotonic_raw -- ./matrix.gcc
> 
> Workload: matrix multiplication in 128 threads
> 
> T : 272
> 	P (period, ms)       : 0.35 
> 	runtime overhead (%) : 13x ~ 87.73 / 6.81

how do you measure this?

> 	data loss (%)        : 0
> 	LOST events          : 36
> 	SAMPLE events        : 8048542
> 	perf.data size (GiB) : 10

any idea why it has so many more samples?

thanks,
jirka
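
For reference, the arithmetic behind the quoted figures can be sketched as below. This is only a rough illustration, not how the numbers were actually obtained: it assumes that 87.73 and 6.81 are the profiled and baseline run times in the same unit, and it treats the rounded 10 GiB perf.data size as if it were made up entirely of SAMPLE records. The function names are hypothetical.

```python
# Rough sanity check of the figures quoted above (T = 272, P = 0.35 ms).
# Assumption: 87.73 is the run time under perf record and 6.81 is the
# baseline run time, both in the same unit.

def slowdown(profiled: float, baseline: float) -> float:
    """Return how many times slower the profiled run is than the baseline."""
    return profiled / baseline


def avg_bytes_per_sample(size_gib: float, samples: int) -> float:
    """Upper-bound estimate of bytes per sample.

    Treats the whole file as SAMPLE records, which overstates the real
    value because perf.data also holds other record types.
    """
    return size_gib * 2**30 / samples


ratio = slowdown(87.73, 6.81)
per_sample = avg_bytes_per_sample(10, 8048542)

print(f"slowdown: {ratio:.2f}x")
print(f"approx. bytes per sample: {per_sample:.0f}")
```

An estimate a little above the 1024-byte stack dump requested by --call-graph dwarf,1024 would be consistent with each sample carrying the stack copy plus registers and headers, but the input size is rounded, so treat the result as an order-of-magnitude check only.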