linux-ext4 - Re: [PATCH] ext4: Fix call trace when remounting to read only in data=journal mode


Message-ID: <4u2l4huoj7zsfy2u37lgdzlmwwdntgqaer7wta7ud3kat7ox2n@oxhbcqryre3r>
Date: Wed, 28 Jan 2026 11:22:30 +0100
From: Jan Kara <jack@...e.cz>
To: Gerald Yang <gerald.yang@...onical.com>
Cc: tytso@....edu, adilger.kernel@...ger.ca, jack@...e.cz, 
	linux-ext4@...r.kernel.org, gerald.yang.tw@...il.com
Subject: Re: [PATCH] ext4: Fix call trace when remounting to read only in
 data=journal mode

On Wed 28-01-26 15:45:15, Gerald Yang wrote:
> When remounting the filesystem to read only in data=journal mode
> it may dump the following call trace:
> 
> [   71.629350] CPU: 0 UID: 0 PID: 177 Comm: kworker/u96:5 Tainted: G            E       6.19.0-rc7 #1 PREEMPT(voluntary)
> [   71.629352] Tainted: [E]=UNSIGNED_MODULE
> [   71.629353] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/LXD, BIOS unknown 2/2/2022
> [   71.629354] Workqueue: writeback wb_workfn (flush-7:4)
> [   71.629359] RIP: 0010:ext4_journal_check_start+0x8b/0xd0
> [   71.629360] Code: 31 ff 45 31 c0 45 31 c9 e9 42 ad c4 00 48 8b 5d f8 b8 fb ff ff ff c9 31 d2 31 c9 31 f6 31 ff 45 31 c0 45 31 c9 c3 cc cc cc cc <0f> 0b b8 e2 ff ff ff eb c2 0f 0b eb a9 44 8b 42 08 68 c7 53 ce b8
> [   71.629361] RSP: 0018:ffffcf32c0fdf6a8 EFLAGS: 00010202
> [   71.629364] RAX: ffff8f08c8505000 RBX: ffff8f08c67ee800 RCX: 0000000000000000
> [   71.629366] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [   71.629367] RBP: ffffcf32c0fdf6b0 R08: 0000000000000001 R09: 0000000000000000
> [   71.629368] R10: ffff8f08db18b3a8 R11: 0000000000000000 R12: 0000000000000000
> [   71.629368] R13: 0000000000000002 R14: 0000000000000a48 R15: ffff8f08c67ee800
> [   71.629369] FS:  0000000000000000(0000) GS:ffff8f0a7d273000(0000) knlGS:0000000000000000
> [   71.629370] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   71.629371] CR2: 00007b66825905cc CR3: 000000011053d004 CR4: 0000000000772ef0
> [   71.629374] PKRU: 55555554
> [   71.629374] Call Trace:
> [   71.629378]  <TASK>
> [   71.629382]  __ext4_journal_start_sb+0x38/0x1c0
> [   71.629383]  mpage_prepare_extent_to_map+0x4af/0x580
> [   71.629389]  ? sbitmap_get+0x73/0x180
> [   71.629399]  ext4_do_writepages+0x3cc/0x10a0
> [   71.629400]  ? kvm_sched_clock_read+0x11/0x20
> [   71.629409]  ext4_writepages+0xc8/0x1b0
> [   71.629410]  ? ext4_writepages+0xc8/0x1b0
> [   71.629411]  do_writepages+0xc4/0x180
> [   71.629416]  __writeback_single_inode+0x45/0x350
> [   71.629419]  ? _raw_spin_unlock+0xe/0x40
> [   71.629423]  writeback_sb_inodes+0x260/0x5c0
> [   71.629425]  ? __schedule+0x4d1/0x1870
> [   71.629429]  __writeback_inodes_wb+0x54/0x100
> [   71.629431]  ? queue_io+0x82/0x140
> [   71.629433]  wb_writeback+0x1ab/0x330
> [   71.629448]  wb_workfn+0x31d/0x410
> [   71.629450]  process_one_work+0x191/0x3e0
> [   71.629455]  worker_thread+0x2e3/0x420
> 
> This issue can be easily reproduced by:
> mkdir -p mnt
> dd if=/dev/zero of=ext4disk bs=1G count=2 oflag=direct
> mkfs.ext4 ext4disk
> tune2fs -o journal_data ext4disk
> mount ext4disk mnt
> fio --name=fiotest --rw=randwrite --bs=4k --runtime=3 --ioengine=libaio --iodepth=128 --numjobs=4 --filename=mnt/fiotest --filesize=1G --group_reporting
> mount -o remount,ro ext4disk mnt
> sync
> 
> In data=journal mode, metadata and data are both written to the journal
> first. The second write, to the data's final location in the file, is
> left to the writeback thread.
> 
> After the filesystem is remounted read-only, the writeback thread may
> still try to write data to it, which triggers the warning above. Return
> early to avoid starting a journal transaction on a read-only filesystem;
> once the filesystem becomes writable again, the writeback thread will
> continue writing the data.
> 
> Signed-off-by: Gerald Yang <gerald.yang@...onical.com>

Thanks for the report and the patch! I can indeed reproduce this warning.
But the patch itself is certainly not the right fix for this problem.
ext4_remount() must make sure there are no dirty pages left on the
filesystem when remounting it read-only, and it apparently fails to do so.
In particular, it calls sync_filesystem(), which should make sure all data
is written. So this bug needs more investigation into why some dirty pages
are left in the inode in data=journal mode, because ext4_writepages()
should have written them all...
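
To make the expected ordering concrete, here is a toy user-space model of
the invariant described above. It is only a sketch and not ext4 code: every
name in it (toy_sb, toy_writeback, toy_remount_ro, toy_demo) is hypothetical,
and the "warning" counter merely stands in for the warning seen in the trace.
It assumes the warning fires when writeback finds dirty pages after the
read-only flag has been set.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_sb {
	bool rdonly;      /* models the read-only state of the superblock */
	int dirty_pages;  /* models dirty pages still attached to inodes */
	int warnings;     /* counts simulated warnings */
};

/* Models writeback: it needs a "transaction" to write dirty pages. */
static int toy_writeback(struct toy_sb *sb)
{
	if (sb->dirty_pages == 0)
		return 0;         /* nothing to do, no transaction started */
	if (sb->rdonly) {
		sb->warnings++;   /* transaction start on a read-only fs */
		return -1;
	}
	sb->dirty_pages = 0;      /* all dirty pages written out */
	return 0;
}

/*
 * Models remount to read-only. flush_all == true is the expected
 * behaviour (the sync step leaves no dirty pages behind);
 * flush_all == false models the buggy case where pages stay dirty.
 */
static void toy_remount_ro(struct toy_sb *sb, bool flush_all)
{
	if (flush_all)
		toy_writeback(sb);
	sb->rdonly = true;
}

/* Remount, then let background writeback run once; return warnings. */
int toy_demo(bool flush_all)
{
	struct toy_sb sb = { .rdonly = false, .dirty_pages = 3, .warnings = 0 };

	toy_remount_ro(&sb, flush_all);
	toy_writeback(&sb);
	return sb.warnings;
}
```

In this model the warning disappears once the remount path really leaves no
dirty pages behind, without any early return in the writeback path, which is
the direction the investigation should take.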

								Honza

> ---
>  fs/ext4/inode.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 15ba4d42982f..4e3bbf17995e 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2787,6 +2787,17 @@ static int ext4_do_writepages(struct mpage_da_data *mpd)
>  	if (unlikely(ret))
>  		goto out_writepages;
>  
> +	/*
> +	 * For data=journal, if the filesystem was remounted read-only,
> +	 * the writeback thread may still write dirty pages to it.
> +	 * Return early to avoid starting a journal transaction on a
> +	 * read-only filesystem.
> +	 */
> +	if (ext4_should_journal_data(inode) && sb_rdonly(inode->i_sb)) {
> +		ret = -EROFS;
> +		goto out_writepages;
> +	}
> +
>  	/*
>  	 * If we have inline data and arrive here, it means that
>  	 * we will soon create the block for the 1st page, so
> -- 
> 2.43.0
> 
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR