Message-ID: <3227b88a-5607-596c-3ade-74e0b21988e6@linux.intel.com>
Date: Mon, 24 Sep 2018 22:12:15 +0300
From: Alexey Budankov <alexey.budankov@...ux.intel.com>
To: Jiri Olsa <jolsa@...hat.com>
Cc: Jiri Olsa <jolsa@...nel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
lkml <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Andi Kleen <andi@...stfloor.org>
Subject: Re: [RFCv2 00/48] perf tools: Add threads to record command

Hi,

On 24.09.2018 21:32, Alexey Budankov wrote:
> Hi,
>
> On 24.09.2018 17:29, Jiri Olsa wrote:
>> On Mon, Sep 24, 2018 at 04:09:09PM +0300, Alexey Budankov wrote:
>>> Hi,
>>>
>>> On 24.09.2018 10:02, Alexey Budankov wrote:
>>>> Hi,
>>>>
>>>> On 23.09.2018 22:30, Jiri Olsa wrote:
>>>>> On Fri, Sep 21, 2018 at 09:13:08AM +0300, Alexey Budankov wrote:
>>>>>
>>>>> SNIP
>>>>>
>>>>>> Events:
>>>>>> cpu/period=P,event=0x3c/Duk;CPU_CLK_UNHALTED.THREAD
>>>>>> cpu/period=P,umask=0x3/Duk;CPU_CLK_UNHALTED.REF_TSC
>>>>>> cpu/period=P,event=0xc0/Duk;INST_RETIRED.ANY
>>>>>> cpu/period=0xaae61,event=0xc2,umask=0x10/uk;UOPS_RETIRED.ALL
>>>>>> cpu/period=0x11171,event=0xc2,umask=0x20/uk;UOPS_RETIRED.SCALAR_SIMD
>>>>>> cpu/period=0x11171,event=0xc2,umask=0x40/uk;UOPS_RETIRED.PACKED_SIMD
>>>>>>
>>>>>> =================================================
>>>>>>
>>>>>> Command:
>>>>>> /usr/bin/time /tmp/vtune_amplifier_2019.574715/bin64/perf.thr record --threads=T \
>>>>>> -a -N -B -T -R --call-graph dwarf,1024 --user-regs=ip,bp,sp \
>>>>>> -e cpu/period=P,event=0x3c/Duk,\
>>>>>> cpu/period=P,umask=0x3/Duk,\
>>>>>> cpu/period=P,event=0xc0/Duk,\
>>>>>> cpu/period=0x30d40,event=0xc2,umask=0x10/uk,\
>>>>>> cpu/period=0x4e20,event=0xc2,umask=0x20/uk,\
>>>>>> cpu/period=0x4e20,event=0xc2,umask=0x40/uk \
>>>>>> --clockid=monotonic_raw -- ./matrix.(icc|gcc)
>>>>>
>>>>> hum, so I guess the results suck because of the -a option,
>>>>> getting extra samples for all the perf record threads
>>>>>
>>>>> could you try without the -a? you monitor only user events,
>>>>> so you're interested only in ./matrix.* samples, right?
>>>>
>>>> Ok, trying without -a, in per-process mode.
>>>
>>> Command:
>>>
>>> /usr/bin/time ./perf.thr record --threads=T \
>>> -N -B -T -R --call-graph dwarf,1024 --user-regs=ip,bp,sp \
>>> -e cpu/period=P,event=0x3c/Duk,\
>>> cpu/period=P,umask=0x3/Duk,\
>>> cpu/period=P,event=0xc0/Duk,\
>>> cpu/period=0xaae61,event=0xc2,umask=0x10/uk,\
>>> cpu/period=0x11171,event=0xc2,umask=0x20/uk,\
>>> cpu/period=0x11171,event=0xc2,umask=0x40/uk \
>>> --clockid=monotonic_raw -- ./matrix.gcc
>>>
>>> Workload: matrix multiplication in 128 threads
>>>
>>> T : 272
>>> P (period, ms) : 0.35
>>> runtime overhead (%) : 13x ~ 87.73 / 6.81
>>
>> how do you measure this?
>
> This is the ratio of elapsed times:
> runtime overhead (%) : elapsed_time_under_profiling / elapsed_time
> i.e.
>
> /usr/bin/time ./matrix.gcc
> ...
> 767.03user 11.17system 0:06.81elapsed 11424%CPU (0avgtext+0avgdata 100756maxresident)k
> 88inputs+0outputs (0major+139898minor)pagefaults 0swaps
>
> so elapsed_time = 6.81 sec
>
> elapsed_time_under_profiling is the elapsed value from the output of
>
> /usr/bin/time ./perf.thr record --threads=T ...
>
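
For reference, the ratio described above can be reproduced from the /usr/bin/time output with a small script. This is only a sketch: the helper names elapsed_seconds and overhead_ratio are made up for illustration, and it assumes GNU time's default output format, where the wall-clock value is printed as [h:]m:ss immediately before the word "elapsed".

```python
import re


def elapsed_seconds(time_output: str) -> float:
    """Return the wall-clock seconds from GNU time's default output.

    Assumes the elapsed field looks like 0:06.81elapsed or 1:02:03elapsed.
    """
    match = re.search(r"((?:\d+:)?\d+:\d+(?:\.\d+)?)elapsed", time_output)
    if match is None:
        raise ValueError("no 'elapsed' field found in time output")
    seconds = 0.0
    # Fold h:m:s (or m:s) into seconds, most significant field first.
    for part in match.group(1).split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def overhead_ratio(profiled_elapsed: float, baseline_elapsed: float) -> float:
    """elapsed_time_under_profiling / elapsed_time, as defined in the thread."""
    return profiled_elapsed / baseline_elapsed


# Baseline line quoted in this thread (run without perf).
baseline_line = (
    "767.03user 11.17system 0:06.81elapsed 11424%CPU "
    "(0avgtext+0avgdata 100756maxresident)k"
)
baseline = elapsed_seconds(baseline_line)

# 87.73 s is the profiled elapsed time reported earlier in the thread.
print(f"overhead: {overhead_ratio(87.73, baseline):.1f}x")
```

With the numbers from this thread the result is a slowdown factor of roughly 13x, not a percentage.
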
>>
>>> data loss (%) : 0
>>> LOST events : 36
>>> SAMPLE events : 8048542
>>> perf.data size (GiB) : 10
>>
>> any idea why it has so many more samples?
>
> Presumably, this is because the period is 350us, which is the smallest
> one at which perf.thr manages to capture data without data loss (=0) when T=272.
> However, during collection, I get a message that the max sampling frequency
> is lowered to 3KHz.

Lowering default frequency rate to 3000.
Please consider tweaking /proc/sys/kernel/perf_event_max_sample_rate.
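
As a rough cross-check of the numbers, here is a sketch that converts the period to a sampling rate and reads the current limit. It assumes P can be read as the time between samples (as in "P (period, ms)" above); the helper names rate_hz and read_max_sample_rate are hypothetical, and only the /proc path comes from the message itself.

```python
from pathlib import Path

# Path named in the perf message above.
SYSCTL = Path("/proc/sys/kernel/perf_event_max_sample_rate")


def rate_hz(period_ms: float) -> float:
    """Samples per second implied by a period given in milliseconds."""
    return 1000.0 / period_ms


def read_max_sample_rate(path: Path = SYSCTL):
    """Return the current limit as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


requested = rate_hz(0.35)  # P = 0.35 ms from the run above
limit = read_max_sample_rate()
print(f"rate implied by P=0.35 ms: {requested:.0f} Hz")
print(f"perf_event_max_sample_rate: {limit}")
```

A 0.35 ms period works out to about 2857 samples per second, which is close to the 3000 figure in the message. That is only an observation, not a claim that the period itself caused the limit to be lowered.
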

Thanks,
Alexey
>
> Thanks,
> Alexey
>
>>
>> thanks,
>> jirka
>>
>