Message-ID: <CAK-xaQaesuU-TjDQcXgbjoNbZa0Y2qLHtSu5efy99EUDVnuhUg@mail.gmail.com>
Date: Sat, 6 Jul 2024 02:11:50 +0200
From: Andrea Gelmini <andrea.gelmini@...il.com>
To: Filipe Manana <fdmanana@...nel.org>
Cc: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Linux regressions mailing list <regressions@...ts.linux.dev>, Btrfs BTRFS <linux-btrfs@...r.kernel.org>,
dsterba@...e.com, josef@...icpanda.com
Subject: Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased
execution time of the kswapd0 process and symptoms as if there is not enough memory
On Thu, Jul 4, 2024 at 7:25 PM Filipe Manana <fdmanana@...nel.org> wrote:
> 2) Then drop that patch that disables the shrinker.
> With all the previous 4 patches applied, apply this one on top of them:
>
> https://gist.githubusercontent.com/fdmanana/9cea16ca56594f8c7e20b67dc66c6c94/raw/557bd5f6b37b65d210218f8da8987b74bfe5e515/gistfile1.txt
>
> The goal here is to see if the extent map eviction done by the
> shrinker is making reads from other tasks too slow, and check if
> that's what's making your system unresponsive.
>
> 3) Then drop the patch from step 2), and on top of the previous 4
> patches from my git tree, apply this one:
>
> https://gist.githubusercontent.com/fdmanana/a7c9c2abb69c978cf5b80c2f784243d5/raw/b4cca964904d3ec15c74e36ccf111a3a2f530520/gistfile1.txt
>
> This is just to confirm if we do have concurrent calls to the
> shrinker, as the tracing seems to suggest, and where the negative
> numbers come from.
> It also helps to check if not allowing concurrent calls to it, by
> skipping if it's already running, helps make the problems go away.
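(Side note, just to check I understood 3) correctly: the "skip if it's
already running" idea is, I guess, something along the lines of the
sketch below. The flag name and the eviction placeholder are made up
for illustration; the real change is the one in your gist.)

    /*
     * Rough sketch only: skip the extent map shrinker scan if another
     * invocation is already in progress. The em_shrinker_running flag
     * and the eviction step are placeholders, not the actual btrfs code.
     */
    #include <linux/atomic.h>
    #include <linux/shrinker.h>

    static atomic_t em_shrinker_running = ATOMIC_INIT(0);

    static unsigned long em_shrinker_scan(struct shrinker *shrinker,
                                          struct shrink_control *sc)
    {
            unsigned long freed = 0;

            /* Another task is already shrinking: report no progress. */
            if (atomic_cmpxchg(&em_shrinker_running, 0, 1) != 0)
                    return SHRINK_STOP;

            /* ... evict up to sc->nr_to_scan extent maps, counting into freed ... */

            atomic_set(&em_shrinker_running, 0);
            return freed;
    }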
Uhm... good news...
To recap, here are this evening's tests:
Kernel 6.6.36:
Fresh BTRFS (tar cp . | pv -ta > /dev/null): 0:03:53 [ 231MiB/s] (time and average speed)
Aged snapshots (tar cp /.snapshots/ | pv -at -s 100G -S > /dev/null): 0:02:20 [ 726MiB/s]
Kernel rc6+branch+2nd patch:
Fresh BTRFS: 0:03:14 [ 278MiB/s]
Aged snapshots: I had to stop. Memory PSI > 80%, and processes were
stuck most of the time. For example: mplayer over NFS stalls every few
seconds for a while, and switching virtual desktops takes >5 seconds.
Also, "echo 3 > drop_caches" takes more than 5 minutes to finish (on
the other two kernels it was nearly instant).
Kernel rc6+branch+3rd patch:
Fresh BTRFS: 0:03:40 [ 245MiB/s]
Aged snapshots: 0:02:03 [ 826MiB/s]
N.b.: no skyrocketing memory PSI, no swap pressure, no sluggishness!!!
Now, that was just one run; I'm going to use this patch for a few
days. Next week I can tell you for sure whether everything is right!
For the moment it seems we have a winner!