Date: Wed, 26 Jun 2024 11:48:44 +0100
From: Filipe Manana <fdmanana@...nel.org>
To: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>
Cc: Linux List Kernel Mailing <linux-kernel@...r.kernel.org>, 
	Linux regressions mailing list <regressions@...ts.linux.dev>, Btrfs BTRFS <linux-btrfs@...r.kernel.org>, 
	dsterba@...e.com, josef@...icpanda.com
Subject: Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased
 execution time of the kswapd0 process and symptoms as if there is not enough memory

On Tue, Jun 25, 2024 at 10:04 PM Mikhail Gavrilov
<mikhail.v.gavrilov@...il.com> wrote:
>
> Hi,
> after f1d97e769152 I spotted increased execution time of the kswapd0
> process and symptoms as if there is not enough memory.
> Very often I see that kswapd0 consumes 100% CPU [1].
> Before f1d97e769152 after an hour kswapd0 is working ~3:51 and after
> three hours ~10:13 time.
> After f1d97e769152 kswapd0 time increased to ~25:48 after the first
> hour and three hours it hit 71:01 time.
> So execution time has increased by 6-7 times.
>
> f1d97e76915285013037c487d9513ab763005286 is the first bad commit
> commit f1d97e76915285013037c487d9513ab763005286 (HEAD)
> Author: Filipe Manana <fdmanana@...e.com>
> Date:   Fri Mar 22 18:02:59 2024 +0000
>
>     btrfs: add a global per cpu counter to track number of used extent maps
>
>     Add a per cpu counter that tracks the total number of extent maps that are
>     in extent trees of inodes that belong to fs trees. This is going to be
>     used in an upcoming change that adds a shrinker for extent maps. Only
>     extent maps for fs trees are considered, because for special trees such as
>     the data relocation tree we don't want to evict their extent maps which
>     are critical for the relocation to work, and since those are limited, it's
>     not a concern to have them in memory during the relocation of a block
>     group. Another case are extent maps for free space cache inodes, which
>     must always remain in memory, but those are limited (there's only one per
>     free space cache inode, which means one per block group).
>
>     Reviewed-by: Josef Bacik <josef@...icpanda.com>
>     Signed-off-by: Filipe Manana <fdmanana@...e.com>
>     Reviewed-by: David Sterba <dsterba@...e.com>
>     Signed-off-by: David Sterba <dsterba@...e.com>
>
>  fs/btrfs/disk-io.c    |  9 +++++++++
>  fs/btrfs/extent_map.c | 17 +++++++++++++++++
>  fs/btrfs/fs.h         |  2 ++
>  3 files changed, 28 insertions(+)
>
> Unfortunately I can't check the revert commit f1d97e769152 because of conflicts.

Yes, because there are follow up commits that depend on it.

I seriously doubt that this is correctly bisected, because that commit
only adds a counter for tracking the number of extent maps.
It's using a per cpu counter and I can't think of anything more
efficient than that.
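
To illustrate why such a counter is cheap to update, here is a minimal
userspace sketch of the general idea: each CPU updates its own slot, and
only a reader pays the cost of summing all the slots. This is not the
btrfs code nor the kernel's percpu_counter implementation; the names
sharded_counter, NR_SLOTS and the helper functions are hypothetical and
exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

#define NR_SLOTS 4 /* hypothetical number of CPUs, for illustration only */

/*
 * Hypothetical userspace stand-in for a per-CPU counter: one slot per CPU.
 * A real implementation would also keep slots on separate cache lines.
 */
struct sharded_counter {
	int64_t slot[NR_SLOTS];
};

/* Update path: touches only the caller's own slot, so no shared state. */
static void sharded_counter_add(struct sharded_counter *c, unsigned int cpu,
				int64_t delta)
{
	assert(cpu < NR_SLOTS);
	c->slot[cpu] += delta;
}

/* Read path: walks every slot, so it is the expensive operation. */
static int64_t sharded_counter_sum(const struct sharded_counter *c)
{
	int64_t total = 0;

	for (unsigned int i = 0; i < NR_SLOTS; i++)
		total += c->slot[i];
	return total;
}
```

With this layout, adding or removing an extent map would cost one local
increment or decrement, which is why a counter update alone is an unlikely
source of the slowdown.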

The commit that adds the extent map shrinker, which is the next commit
(956a17d9d050761e34ae6f2624e9c1ce456de204), is the one that can
explain what you are observing.

The one you bisected doesn't make sense, not only because it's
just a counter update, but also because the only symptom you are
seeing is the kswapd0 slowdown, and kswapd0 is what triggers the shrinker.

The shrinker itself can be improved: there's one place where I know it
might loop too much, and I'll fix that.

Thanks.

>
> > git reset --hard v6.10-rc1
> HEAD is now at 1613e604df0c Linux 6.10-rc1
>
> > git revert -n f1d97e76915285013037c487d9513ab763005286
> Auto-merging fs/btrfs/disk-io.c
> Auto-merging fs/btrfs/extent_map.c
> Auto-merging fs/btrfs/fs.h
> CONFLICT (content): Merge conflict in fs/btrfs/fs.h
> error: could not revert f1d97e769152... btrfs: add a global per cpu
> counter to track number of used extent maps
>
> However I double checked every bisect step and I am confident in the
> correctness of the result.
>
> I also attach here a full kernel log and build config.
>
> My hardware specs: https://linux-hardware.org/?probe=d377acdb9e
>
> Filipe can you look into this please?
>
> [1] https://postimg.cc/Xrn6qfxh
>
> --
> Best Regards,
> Mike Gavrilov.
