linux-ext4 - Re: [PATCH v3] ext4: Make sure BH_New bit is cleared in ->write

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20241113175550.GA462442@mit.edu>
Date: Wed, 13 Nov 2024 12:55:50 -0500
From: "Theodore Ts'o" <tytso@....edu>
To: Jan Kara <jack@...e.cz>
Cc: linux-ext4@...r.kernel.org, Baolin Liu <liubaolin@...inos.cn>,
        Zhi Long <longzhi@...gfor.com.cn>
Subject: Re: [PATCH v3] ext4: Make sure BH_New bit is cleared in ->write_end
 handler

On Fri, Oct 18, 2024 at 04:59:01PM +0200, Jan Kara wrote:
> Currently we clear BH_New bit in case of error and also in the standard
> ext4_write_end() handler (in block_commit_write()). However
> ext4_journalled_write_end() misses this clearing and thus we are leaving
> stale BH_New bits behind. Generally ext4_block_write_begin() clears
> these bits before any harm can be done but in case blocksize < pagesize
> and we hit some error when processing a page with these stale bits,
> we'll try to zero buffers with these stale BH_New bits and jbd2 will
> complain (as buffers were not prepared for writing in this transaction).
> Fix the problem by clearing BH_New bits in ext4_journalled_write_end()
> and WARN if ext4_block_write_begin() sees stale BH_New bits.
> 
> Reported-and-tested-by: Baolin Liu <liubaolin@...inos.cn>
> Reported-and-tested-by: Zhi Long <longzhi@...gfor.com.cn>
> Fixes: 3910b513fcdf ("ext4: persist the new uptodate buffers in ext4_journalled_zero_new_buffers")
> Signed-off-by: Jan Kara <jack@...e.cz>

This patch is causing quite a lot of regressions:

ext4/adv: 569 tests, 36 failures, 61 skipped, 6510 seconds
  Failures: ext4/307 generic/069 generic/079 generic/082 generic/130 
    generic/131 generic/219 generic/230 generic/231 generic/232 
    generic/233 generic/234 generic/235 generic/241 generic/244 
    generic/270 generic/280 generic/355 generic/379 generic/381 
    generic/382 generic/400 generic/422 generic/464 generic/566 
    generic/571 generic/572 generic/587 generic/600 generic/601 
    generic/681 generic/682 generic/691

This appears to be caused by inline data, so a quick reproducer for
bisection purposes was:

   kvm-xfststs -c ext4/inline ext4/307

Attached below please find the warning which is triggering the
"_check_dmesg: something found in dmesg" test failure.

I suspect this should be fairly easy to fix, but I'm going to drop it
from my tree for now.

					- Ted

WARNING: CPU: 0 PID: 2473 at fs/ext4/inode.c:1059 ext4_block_write_begin+0x2f9/0x470
CPU: 0 UID: 0 PID: 2473 Comm: md5sum Not tainted 6.12.0-rc4-xfstests-00028-g9af1fa3b363b #437
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:ext4_block_write_begin+0x2f9/0x470
Code: 40 01 89 44 24 34 e9 c4 fe ff ff 49 8b 01 a8 08 0f 84 b9 fe ff ff 48 8b 03 a8 01 0f 85 ae fe ff ff f0 80 0b 01 e9 a5 fe ff ff <0f> 0b f0 80 23 df e9 12 fe ff ff 0f b7 10 66 81 e2 00 f0 66 81 fa
RSP: 0018:ffffc90003187cd8 EFLAGS: 00010202
RAX: 0000000000000133 RBX: ffff888007c70270 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 0000000000001000 RDI: ffff888007d39c88
RBP: 0000000000001000 R08: ffffffff81466c60 R09: ffffea00003b5940
R10: 0000000000000000 R11: 00000000000000c0 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000001000 R15: 0000000000000000
FS:  00007f337bc40740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000562a05dff538 CR3: 000000000c9ba005 CR4: 0000000000770ef0
PKRU: 55555554
Call Trace:
 <TASK>
 ? show_trace_log_lvl+0x1a5/0x2e0
 ? show_trace_log_lvl+0x1a5/0x2e0
 ? ext4_da_write_begin+0x153/0x270
 ? ext4_block_write_begin+0x2f9/0x470
 ? __warn.cold+0x8e/0xe8
 ? ext4_block_write_begin+0x2f9/0x470
 ? report_bug+0x123/0x190
 ? handle_bug+0x53/0x90
 ? exc_invalid_op+0x17/0x70
 ? asm_exc_invalid_op+0x1a/0x20
 ? __pfx_ext4_da_get_block_prep+0x10/0x10
 ? ext4_block_write_begin+0x2f9/0x470
 ? ext4_block_write_begin+0x77/0x470
 ? __pfx_ext4_da_get_block_prep+0x10/0x10
 ext4_da_write_begin+0x153/0x270
 generic_perform_write+0xd7/0x290
 ext4_buffered_write_iter+0x64/0x100
 vfs_write+0x277/0x450
 ksys_write+0x6d/0xf0
 do_syscall_64+0x4b/0x110
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f337bd3b240
Code: 40 00 48 8b 15 c1 9b 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 80 3d a1 23 0e 00 00 74 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 48 83 ec 28 48 89
RSP: 002b:00007fff53588b08 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000043 RCX: 00007f337bd3b240
RDX: 0000000000000043 RSI: 0000562a05df54d0 RDI: 0000000000000001
RBP: 0000562a05df54d0 R08: 00007f337bddd4e0 R09: 0000000000000078
R10: 00007fff53588a86 R11: 0000000000000202 R12: 0000000000000043
R13: 00007f337be16760 R14: 0000000000000043 R15: 00007f337be119e0
 </TASK>
---[ end trace 0000000000000000 ]---