Date:   Mon, 5 Dec 2022 23:50:48 +0200
From:   Ivan Zahariev <famzah@...soft.com>
To:     Theodore Ts'o <tytso@....edu>
Cc:     linux-ext4@...r.kernel.org,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>
Subject: Re: kernel BUG at fs/ext4/inode.c:1914 - page_buffers()

On 5.12.2022 г. 23:10, Theodore Ts'o wrote:
> Is it fair to say that your workload is using data=journaled and is
> frequently truncating files that might have been recently modified
> (hence triggering the race between truncate and journalled writepages)?

The servers host hundreds of users who run their own tasks, and we have 
neither control over their usage patterns nor a way to observe them 
closely, unless you can point us to a way to debug this.

"data=journaled" is definitely in place for all servers.

> I wonder if you could come up with a more reliable reproducer so we
> can test a particular patch.

We already tried different parallel combinations of mmap()'ed reading, 
direct and regular write(), drop_caches, sync(), etc. but we can't 
trigger the panic.

If you have any suggestions for what we should try next as a reproducer, 
please share them and we will implement and run it.

Did I understand correctly that a possible reproducer would be a loop of 
heavy write() followed by truncate() of the same file? Should we 
randomly sync() and/or "echo 3 > /proc/sys/vm/drop_caches" to increase 
the chance of hitting the bug?
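
To make the question concrete, here is a minimal sketch of the loop described above. It is only an illustration of the idea, not a confirmed reproducer: the function name, parameters, and defaults are hypothetical, and it assumes the target path lives on an ext4 filesystem mounted with data journaling. Writing to drop_caches requires root, so that step is optional and failures are ignored.

```python
import os
import random


def write_truncate_loop(path, iterations=1000, chunk=1 << 20,
                        sync_prob=0.1, drop_caches=False):
    """Repeatedly write to and truncate the same file.

    Hypothetical helper for illustration only. Returns the final file size,
    which is 0 because every iteration ends with a truncate to zero.
    """
    buf = os.urandom(chunk)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for _ in range(iterations):
            # Dirty the page cache for this file with a regular write.
            os.pwrite(fd, buf, 0)
            # Truncate right away, while the pages may still be dirty.
            os.ftruncate(fd, 0)
            # Occasionally force writeback of everything.
            if random.random() < sync_prob:
                os.sync()
            # Occasionally drop clean caches (needs root; ignore failures).
            if drop_caches and random.random() < sync_prob:
                try:
                    with open("/proc/sys/vm/drop_caches", "w") as f:
                        f.write("3\n")
                except OSError:
                    pass
        return os.fstat(fd).st_size
    finally:
        os.close(fd)
```

Several such loops could be run in parallel against different files on the same filesystem, for example write_truncate_loop("/path/on/ext4/testfile", drop_caches=True) from a few processes. Whether that is enough to widen the race window is exactly what we are unsure about.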

Best regards.
--Ivan
