Message-ID: <6f64edf8-2bed-4131-0042-6a2005ed6926@icdsoft.com>
Date:   Mon, 5 Dec 2022 19:27:16 +0200
From:   Ivan Zahariev <famzah@...soft.com>
To:     linux-ext4@...r.kernel.org
Cc:     Theodore Ts'o <tytso@....edu>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: kernel BUG at fs/ext4/inode.c:1914 - page_buffers()

Hello,

I forgot to mention that the ext4 file system is mounted with
"data=journal", and that the crash happens on servers which have more
than 20 GB of RAM and are under heavy I/O load.

> Back to the problem! 99% of the difference between 4.14 and the latest 
> kernel for __ext4_journalled_writepage() in "fs/ext4/inode.c" comes 
> from the following commit: 
> https://github.com/torvalds/linux/commit/5c48a7df91499e371ef725895b2e2d21a126e227
>
> Is it safe that we revert this patch on the latest 5.15 kernel, so 
> that we can confirm if this resolves the issue for us?

If we can't revert the patch, or if reverting it doesn't make sense, is
there anything else we can do to help debug this rare kernel crash?

The machines are QEMU/KVM guests, but dumping the whole memory would take
a couple of minutes, so it's not viable.

Are there any debug statements we could add to
__ext4_journalled_writepage() in "fs/ext4/inode.c" that might give a hint
as to where the problem is?

Best regards.
--Ivan

