linux-kernel - Re: [Regression or Fix]perf: profiling stats sigificantly changed for aio_write/read(ext4) between 6.7.0-rc1 and 6.6.0


Message-ID: <1da1b7f.564.18be01bd6ce.Coremail.00107082@163.com>
Date:   Sat, 18 Nov 2023 09:46:42 +0800 (CST)
From:   "David Wang" <00107082@....com>
To:     "Namhyung Kim" <namhyung@...nel.org>
Cc:     "Peter Zijlstra" <peterz@...radead.org>, mingo@...hat.com,
        acme@...nel.org, mark.rutland@....com,
        alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
        irogers@...gle.com, adrian.hunter@...el.com,
        linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [Regression or Fix]perf: profiling stats sigificantly changed
 for aio_write/read(ext4) between 6.7.0-rc1 and 6.6.0


At 2023-11-18 05:11:02, "Namhyung Kim" <namhyung@...nel.org> wrote:
>On Wed, Nov 15, 2023 at 8:09 PM David Wang <00107082@....com> wrote:
>>
>> From the data I collected, I think two problems can be observed with f06cc667f79909e9175460b167c277b7c64d3df0:
>> 1. Samples are missing.
>> 2. Samples are unstable: the total sample count drifts a lot between tests.
>
>Hmm.. so the fio process was running in the background during
>the profiling, right?  But I'm not sure how you measured the same
>amount of time.  Probably you need to run this (for 10 seconds):
>
>  sudo perf record -a -G mytest -- sleep 10
>
>And I guess you don't run the perf command in the target cgroup
>which is good.
>

Yes, the profiling process was not in the target cgroup.
I run fio with `fio --randrepeat=1 --ioengine=libaio --direct=1 --name=test  --bs=4k --iodepth=64 --size=1G --readwrite=randrw  --runtime=600 --numjobs=4 --time_based=1`, which runs for 600 seconds.
The profiling report drifts between runs. The test data I collected is a small sample, maybe not enough to draw a firm conclusion, but my impression is that when the commit is reverted, the expected total sample count is higher and the standard deviation is smaller.
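
A minimal sketch of how the drift between runs could be quantified, assuming the total sample count of each run has been noted down by hand. The helper name `summarize` and the counts below are hypothetical placeholders, not measured data; the point is only to compare the mean and the spread of the two configurations.

```python
import statistics

def summarize(sample_counts):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    mean = statistics.mean(sample_counts)
    stdev = statistics.stdev(sample_counts)  # needs at least two runs
    return mean, stdev, stdev / mean

# Hypothetical placeholder totals, one entry per profiling run.
# These are NOT the numbers from the tests discussed in this thread.
with_commit = [100, 140, 60, 120]
reverted = [150, 155, 145, 150]

for label, counts in (("with commit", with_commit), ("reverted", reverted)):
    mean, stdev, cv = summarize(counts)
    print(f"{label}: mean={mean:.1f} stdev={stdev:.1f} cv={cv:.3f}")
```

The coefficient of variation (stdev divided by mean) is used so that the spread stays comparable even when the expected total sample count differs between the two kernels.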

>And is there any chance if it's improved because of the change?
>Are the numbers in 6.7 better or worse?
>
I have no idea whether the change in the expected total sample count is a bug or a fix, but the observed result that the total sample count drifts a lot (larger standard deviation) is, I think, a bad thing.
 

Thanks
David Wang