linux-ext4 - Re: [PATCH v3] ext4: Make sure BH_New bit is cleared in ->write

Message-ID: <20241114152607.v3bdfpu2sgayztdr@quack3>
Date: Thu, 14 Nov 2024 16:26:07 +0100
From: Jan Kara <jack@...e.cz>
To: Theodore Ts'o <tytso@....edu>
Cc: Jan Kara <jack@...e.cz>, linux-ext4@...r.kernel.org,
	Baolin Liu <liubaolin@...inos.cn>,
	Zhi Long <longzhi@...gfor.com.cn>
Subject: Re: [PATCH v3] ext4: Make sure BH_New bit is cleared in ->write_end
 handler

On Wed 13-11-24 12:55:50, Theodore Ts'o wrote:
> On Fri, Oct 18, 2024 at 04:59:01PM +0200, Jan Kara wrote:
> > Currently we clear BH_New bit in case of error and also in the standard
> > ext4_write_end() handler (in block_commit_write()). However
> > ext4_journalled_write_end() misses this clearing and thus we are leaving
> > stale BH_New bits behind. Generally ext4_block_write_begin() clears
> > these bits before any harm can be done but in case blocksize < pagesize
> > and we hit some error when processing a page with these stale bits,
> > we'll try to zero buffers with these stale BH_New bits and jbd2 will
> > complain (as buffers were not prepared for writing in this transaction).
> > Fix the problem by clearing BH_New bits in ext4_journalled_write_end()
> > and WARN if ext4_block_write_begin() sees stale BH_New bits.
> > 
> > Reported-and-tested-by: Baolin Liu <liubaolin@...inos.cn>
> > Reported-and-tested-by: Zhi Long <longzhi@...gfor.com.cn>
> > Fixes: 3910b513fcdf ("ext4: persist the new uptodate buffers in ext4_journalled_zero_new_buffers")
> > Signed-off-by: Jan Kara <jack@...e.cz>
> 
> This patch is causing quite a lot of regressions:
> 
> ext4/adv: 569 tests, 36 failures, 61 skipped, 6510 seconds
>   Failures: ext4/307 generic/069 generic/079 generic/082 generic/130 
>     generic/131 generic/219 generic/230 generic/231 generic/232 
>     generic/233 generic/234 generic/235 generic/241 generic/244 
>     generic/270 generic/280 generic/355 generic/379 generic/381 
>     generic/382 generic/400 generic/422 generic/464 generic/566 
>     generic/571 generic/572 generic/587 generic/600 generic/601 
>     generic/681 generic/682 generic/691
> 
> This appears to be caused by inline data, so a quick reproducer for
> bisection purposes was:
> 
>    kvm-xfstests -c ext4/inline ext4/307
> 
> Attached below please find the warning which is triggering the
> "_check_dmesg: something found in dmesg" test failure.
> 
> I suspect this should be fairly easy to fix, but I'm going to drop it
> from my tree for now.

Yeah, sure. I didn't test with inline data so I didn't notice. I'll check
what's going wrong and sorry for the annoyance.

									Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
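
The failure mode described in the quoted commit message can be sketched with a toy userspace model. This is only an illustration under stated assumptions: `struct toy_bh` and every `toy_*` function are hypothetical names, not kernel APIs, and the model leaves out the clearing that ext4_block_write_begin() normally does, so it only shows the error path when blocksize < pagesize.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a buffer head; NOT the kernel's struct buffer_head. */
struct toy_bh {
    bool is_new;      /* models the BH_New bit */
    bool journalled;  /* models "prepared for writing in this transaction" */
};

#define BUFS_PER_PAGE 4  /* e.g. 1k blocks in a 4k page (blocksize < pagesize) */

/* Models write_begin: newly allocated buffers in [first, last] are flagged
 * new and get journal write access for the current transaction. */
void toy_write_begin(struct toy_bh *page, size_t first, size_t last)
{
    for (size_t i = first; i <= last; i++) {
        page[i].is_new = true;
        page[i].journalled = true;
    }
}

/* Models write_end: the write is finished, so journal access from this
 * write no longer applies. A correct handler also clears the new flag;
 * clear_new == false models a handler that forgets to. */
void toy_write_end(struct toy_bh *page, bool clear_new)
{
    for (size_t i = 0; i < BUFS_PER_PAGE; i++) {
        if (clear_new)
            page[i].is_new = false;
        page[i].journalled = false;
    }
}

/* Models the error path of a later write: zero every buffer still flagged
 * new. Returns how many of those were not prepared for writing, which is
 * where the journalling layer would complain. */
int toy_zero_new_buffers(struct toy_bh *page)
{
    int complaints = 0;

    for (size_t i = 0; i < BUFS_PER_PAGE; i++) {
        if (!page[i].is_new)
            continue;
        if (!page[i].journalled)
            complaints++;
        page[i].is_new = false;
    }
    return complaints;
}

/* Two writes to the same page: the first covers buffers 0..1 and completes,
 * the second covers buffers 2..3 and then hits an error. */
int toy_scenario(bool clear_new_in_write_end)
{
    struct toy_bh page[BUFS_PER_PAGE] = {0};

    toy_write_begin(page, 0, 1);
    toy_write_end(page, clear_new_in_write_end);

    toy_write_begin(page, 2, 3);
    return toy_zero_new_buffers(page);
}
```

In this model, `toy_scenario(false)` reports two complaints, one for each stale buffer left by the first write, while `toy_scenario(true)` reports none.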