Message-ID: <489ecb9e.28cc.18bd650affa.Coremail.00107082@163.com>
Date: Thu, 16 Nov 2023 12:08:14 +0800 (CST)
From: "David Wang" <00107082@....com>
To: "Namhyung Kim" <namhyung@...nel.org>
Cc: "Peter Zijlstra" <peterz@...radead.org>, mingo@...hat.com,
acme@...nel.org, mark.rutland@....com,
alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
irogers@...gle.com, adrian.hunter@...el.com,
linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: Re: [Regression or Fix] perf: profiling stats significantly
	changed for aio_write/read(ext4) between 6.7.0-rc1 and 6.6.0
At 2023-11-16 00:26:06, "Namhyung Kim" <namhyung@...nel.org> wrote:
>On Wed, Nov 15, 2023 at 8:12 AM David Wang <00107082@....com> wrote:
>>
>>
>> On 2023-11-15 23:48:33, "Namhyung Kim" <namhyung@...nel.org> wrote:
>> >On Wed, Nov 15, 2023 at 3:00 AM David Wang <00107082@....com> wrote:
>> >>
>> >>
>> >>
>> >> At 2023-11-15 18:32:41, "Peter Zijlstra" <peterz@...radead.org> wrote:
>> >> >
>> >> >Namhyung, could you please take a look, you know how to operate this
>> >> >cgroup stuff.
>> >> >
>> >>
>> >> More information: I ran the profiling on an 8-CPU machine with an ext4 filesystem on an SSD:
>> >>
>> >> # mkdir /sys/fs/cgroup/mytest
>> >> # echo $$ > /sys/fs/cgroup/mytest/cgroup.procs
>> >> ## Start profiling targeting cgroup /sys/fs/cgroup/mytest on another terminal
>> >> # fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test --bs=4k --iodepth=64 --size=1G --readwrite=randrw --runtime=600 --numjobs=4 --time_based=1
>> >>
>> >> My impression is that f06cc667f7990 decreases total samples by 10%~20% when profiling an IO benchmark within a cgroup.
>
>Then what is your profiling tool? Where did you see
>the 10%~20% drop in samples?
>
I wrote a simple/raw tool just for profiling callchains, which uses perf_event_open with the following attr:
attr.type = PERF_TYPE_SOFTWARE;
attr.config = PERF_COUNT_SW_CPU_CLOCK;
attr.sample_freq = 777; // adjust it
attr.freq = 1;
attr.wakeup_events = 16;
attr.sample_type = PERF_SAMPLE_TID|PERF_SAMPLE_CALLCHAIN;
attr.sample_max_stack = 32;
The source code could be found here: https://github.com/zq-david-wang/linux-tools/tree/main/perf/profiler
>>
>> I am not experienced with the perf tool at all; it is too complicated a tool for me... But I think I can try it.
>
>I feel sorry about that. In most cases, just `perf record -a` and
>then `perf report` would work well. :)
>
Thanks for the information. I used the following command to profile with perf:
`./perf record -a -e cpu-clock -G mytest`
I ran several rounds of tests, rebooting the system before each one. The perf output is:
On 6.7.0-rc1:
$ sudo ./perf record -a -e cpu-clock -G mytest
^C[ perf record: Woken up 527 times to write data ]
[ perf record: Captured and wrote 132.648 MB perf.data (2478745 samples) ]
---reboot
$ sudo ./perf record -a -e cpu-clock -G mytest
^C[ perf record: Woken up 473 times to write data ]
[ perf record: Captured and wrote 119.205 MB perf.data (2226994 samples) ]
On 6.7.0-rc1 with f06cc667f79909e9175460b167c277b7c64d3df0 reverted
$ sudo ./perf record -a -e cpu-clock -G mytest
^C[ perf record: Woken up 567 times to write data ]
[ perf record: Captured and wrote 142.771 MB perf.data (2668224 samples) ]
---reboot
$ sudo ./perf record -a -e cpu-clock -G mytest
^C[ perf record: Woken up 557 times to write data ]
[ perf record: Captured and wrote 140.604 MB perf.data (2627167 samples) ]
I also ran with `-F 777`, the same (arbitrary) frequency my tool uses, to compare against it.
On 6.7.0-rc1
$ sudo ./perf record -a -e cpu-clock -F 777 -G mytest
^C[ perf record: Woken up 93 times to write data ]
[ perf record: Captured and wrote 24.575 MB perf.data (455222 samples) ] (my tool got only ~359K samples, not stable)
On 6.7.0-rc1 with f06cc667f79909e9175460b167c277b7c64d3df0 reverted
$ sudo ./perf record -a -e cpu-clock -F 777 -G mytest
^C[ perf record: Woken up 98 times to write data ]
[ perf record: Captured and wrote 25.703 MB perf.data (476390 samples) ] (my tool got about ~446K samples, stable)
From the data I collected, I think two problems can be observed with f06cc667f79909e9175460b167c277b7c64d3df0:
1. Missing samples.
2. Unstable sampling: the total sample count drifts a lot between test runs.
Thanks
David