lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAK-xaQa2NP0kfwQZoko-FUsSCbW31F1S48SJy8+94aSs7PCd3w@mail.gmail.com>
Date: Fri, 5 Jul 2024 00:15:00 +0200
From: Andrea Gelmini <andrea.gelmini@...il.com>
To: Filipe Manana <fdmanana@...nel.org>
Cc: Mikhail Gavrilov <mikhail.v.gavrilov@...il.com>, 
	Linux List Kernel Mailing <linux-kernel@...r.kernel.org>, 
	Linux regressions mailing list <regressions@...ts.linux.dev>, Btrfs BTRFS <linux-btrfs@...r.kernel.org>, 
	dsterba@...e.com, josef@...icpanda.com
Subject: Re: 6.10/regression/bisected - after f1d97e769152 I spotted increased
 execution time of the kswapd0 process and symptoms as if there is not enough memory

Il giorno gio 4 lug 2024 alle ore 19:25 Filipe Manana
<fdmanana@...nel.org> ha scritto:
> 2) In some cases we get very large negative numbers for the number of
> extent maps to scan.
>     This shouldn't happen and either our own btrfs counter might have
> overflowed or some other bug,

Well, I was thinking about my specific odds, and I tried this:
a) kernel 6.6.36;
b) on spare partition nvme created a new shiny btrfs;
c) then mount it forcing compression;
d) multiple parallel cp of kernel and libreoffice src;
e) reboot with same rc6+branch already used;
f) tar of the new btrfs: no problem at all;
g) let it finish;
h) tar of /.snapshots: PSI memory skyrocket, and usual slowdown reading;
i) stop it;
l) again tar of the new btrfs: no problem
m) repeat a few times.

You can see the output here:
https://asciinema.org/a/rJpGWvXYH6IDBXWYhtJckkKWo

In the end you see I kill tar and let the PSI going down to zero, if
you are interested.

> Ok, so maybe I missed it, but I haven't kswapd0 in there, or nothing
> taking 100% CPU.
> Maybe it was just Mikhail running into that?

To have this effect and the extreme luggish response (I mean, click
something and it takes more than 30 seconds to react)
I need to work at least one day on my laptop. At this point also
cycling to virtual desktop takes a lot.

Thinking about my different use case:
a) i always suspend. I just reboot when change kernel. So, I can work
for weeks with same kernel. Suspend2RAM, not disk, btw;
b) months ago I let run beesd for a day.

> So I'm surprised that you get an unresponsive desktop.
Same point as before. In this case is not so luggish, but - i.e. - if
I click for screenlock it doesn't start immediately, it waits for a
little bit more than one second.

> Interestingly, here the memory PSI stays at 0% or very close to that,
> it never reaches anything close to the 60%.

You see the same thing with the last test with new btrfs partition.
New partition: ~0%
/.snapshots/: near 60%.


> With htop in parallel, the bpftrace script, and since my htop version
> doesn't show PSI information (probably an older version than yours), I
> kept monitoring PSI like this:

Well, mine is taken from here:
https://github.com/htop-dev/htop.git
Compiled with:
./configure --enable-capabilities --enable-delayacct --enable-sensors
--enable-werror   --enable-affinity
And tweaked config file. If you want I can send it.


> So several different things to try here:

I stop here for the moment. I have to sleep.
In the weekend I do the rest and reply to you!

> Thanks a lot to you and Mikhail, not just for the reporting but also
> to apply patches, compile a kernel, run the tests and do all those
> valuable observations which are all very time consuming.

My little contribution to free software!

Ciao,
Gelma

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ