linux-ext4 - Re: kernel BUG at fs/ext4/inode.c:1914


Message-ID: <6f64edf8-2bed-4131-0042-6a2005ed6926@icdsoft.com>
Date:   Mon, 5 Dec 2022 19:27:16 +0200
From:   Ivan Zahariev <famzah@...soft.com>
To:     linux-ext4@...r.kernel.org
Cc:     Theodore Ts'o <tytso@....edu>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: kernel BUG at fs/ext4/inode.c:1914 - page_buffers()

Hello,

I forgot to mention that the ext4 file system is mounted with
"data=journal", and that the crash happens on servers which have more
than 20 GB of RAM and are under heavy I/O load.
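
For anyone who wants to check whether a machine matches this
configuration, here is a minimal sketch in Python. It assumes the mount
table is in the usual /proc/mounts format and that ext4 lists
"data=journal" among the mount options there; the function name
journalled_ext4_mounts is only illustrative.

```python
def journalled_ext4_mounts(mounts_text):
    """Return mount points of ext4 file systems mounted with data=journal.

    mounts_text is expected to be the contents of /proc/mounts:
    "device mount_point fs_type options dump pass", one mount per line.
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        # Compare whole option tokens so a substring match cannot mislead us.
        if fs_type == "ext4" and "data=journal" in options.split(","):
            found.append(mount_point)
    return found
```

To use it on a live system, pass it the text read from /proc/mounts,
for example journalled_ext4_mounts(open("/proc/mounts").read()).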

> Back to the problem! 99% of the difference between 4.14 and the latest 
> kernel for __ext4_journalled_writepage() in "fs/ext4/inode.c" comes 
> from the following commit: 
> https://github.com/torvalds/linux/commit/5c48a7df91499e371ef725895b2e2d21a126e227
>
> Is it safe that we revert this patch on the latest 5.15 kernel, so 
> that we can confirm if this resolves the issue for us?

If we can't revert the patch, or if reverting it doesn't make sense, is
there anything else we can do to help debug this rare kernel crash?

The machines are QEMU/KVM guests, but dumping the whole memory would
take a couple of minutes, so it's not viable.

Are there any debug statements we could add to
__ext4_journalled_writepage() in "fs/ext4/inode.c" that might hint at
where the problem is?

Best regards.
--Ivan