Message-ID: <87v8zx8zia.mognet@arm.com>
Date: Thu, 09 Dec 2021 17:22:05 +0000
From: Valentin Schneider <valentin.schneider@....com>
To: Josef Bacik <josef@...icpanda.com>
Cc: peterz@...radead.org, vincent.guittot@...aro.org,
torvalds@...ux-foundation.org, linux-kernel@...r.kernel.org,
linux-btrfs@...r.kernel.org
Subject: Re: [REGRESSION] 5-10% increase in IO latencies with nohz balance patch
On 06/12/21 09:48, Valentin Schneider wrote:
> On 03/12/21 14:00, Josef Bacik wrote:
>> On Fri, Dec 03, 2021 at 12:03:27PM +0000, Valentin Schneider wrote:
>>> Could you give the 4 top patches, i.e. those above
>>> 8c92606ab810 ("sched/cpuacct: Make user/system times in cpuacct.stat more precise")
>>> a try?
>>>
>>> https://git.gitlab.arm.com/linux-arm/linux-vs.git -b mainline/sched/nohz-next-update-regression
>>>
>>> I gave that a quick test on the platform that caused me to write the patch
>>> you bisected and looks like it didn't break the original fix. If the above
>>> counter-measures aren't sufficient, I'll have to go poke at your
>>> reproducers...
>>>
>>
>> It's better but still around 6% regression. If I compare these patches to the
>> average of the last few days worth of runs you're 5% better than before, so
>> progress but not completely erased.
>>
>
> Hmph, time for me to reproduce this locally then. Thanks!

I carved a partition out of an Ampere eMAG's HDD to play with BTRFS
via fsperf; this is what I get for the bisected commit (baseline is the
bisected patchset's immediate parent, aka v5.15-rc4) via a handful of:

  ./fsperf -p before-regression -c btrfs -n 100 -t emptyfiles500k

  metric              baseline    current      stdev     diff
  write_clat_ns_p99  195395.92  198790.46    4797.01    1.74%
  write_iops          17305.79   17471.57     250.66    0.96%

  write_clat_ns_p99  195395.92  197694.06    4797.01    1.18%
  write_iops          17305.79   17533.62     250.66    1.32%

  write_clat_ns_p99  195395.92  197903.67    4797.01    1.28%
  write_iops          17305.79   17519.71     250.66    1.24%

If I compare against tip/sched/core however:

  metric              baseline    current      stdev     diff
  write_clat_ns_p99  195395.92  202936.32    4797.01    3.86%
  write_iops          17305.79   17065.46     250.66   -1.39%

  write_clat_ns_p99  195395.92  204349.44    4797.01    4.58%
  write_iops          17305.79   17097.79     250.66   -1.20%

  write_clat_ns_p99  195395.92  204169.05    4797.01    4.49%
  write_iops          17305.79   17112.29     250.66   -1.12%

tip/sched/core + my patches:

  metric              baseline    current      stdev     diff
  write_clat_ns_p99  195395.92  205721.60    4797.01    5.28%
  write_iops          17305.79   16947.59     250.66   -2.07%

  write_clat_ns_p99  195395.92  203358.04    4797.01    4.07%
  write_iops          17305.79   16953.24     250.66   -2.04%

  write_clat_ns_p99  195395.92  201830.40    4797.01    3.29%
  write_iops          17305.79   17041.18     250.66   -1.53%

So tip/sched/core seems to have a much worse regression, and my patches
are making things worse on that system...
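
The percentages in the runs above are consistent with
(current - baseline) / baseline. Below is a minimal Python sketch for
re-deriving them and averaging the three runs per kernel. It assumes the four
numeric columns are baseline, current, stdev and diff, in that order, and that
stdev belongs to the baseline; the helper names pct_diff and summarize are
made up for this illustration and are not part of fsperf.

```python
from statistics import mean

# write_clat_ns_p99 figures copied from the runs quoted in this mail.
BASELINE_P99 = 195395.92
STDEV_P99 = 4797.01  # assumption: standard deviation of the baseline runs
CURRENT_P99 = {
    "bisected commit": [198790.46, 197694.06, 197903.67],
    "tip/sched/core": [202936.32, 204349.44, 204169.05],
    "tip/sched/core + patches": [205721.60, 203358.04, 201830.40],
}


def pct_diff(baseline: float, current: float) -> float:
    """Relative change of current versus baseline, in percent."""
    return (current - baseline) / baseline * 100.0


def summarize(baseline: float, stdev: float, runs: list[float]) -> tuple[float, float]:
    """Return (percent diff of the mean run, mean delta in units of stdev)."""
    avg = mean(runs)
    return pct_diff(baseline, avg), (avg - baseline) / stdev


if __name__ == "__main__":
    for name, runs in CURRENT_P99.items():
        diff, sigmas = summarize(BASELINE_P99, STDEV_P99, runs)
        print(f"{name:26s} {diff:6.2f}%  {sigmas:5.2f} stdev")
```

Expressing the delta in units of stdev is only a rough way to compare a shift
against run-to-run noise; it is not a significance test.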

I've started a bisection to see where the above leads me; unfortunately
this machine needs more babysitting than I thought, so it's gonna take a
while.

@Josef any chance you could see if the above also applies to you? tip lives
at https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git, though from
where my bisection is taking me it looks like you should see that against
Linus' tree as well.

Thanks,
Valentin