Message-ID: <4u2l4huoj7zsfy2u37lgdzlmwwdntgqaer7wta7ud3kat7ox2n@oxhbcqryre3r>
Date: Wed, 28 Jan 2026 11:22:30 +0100
From: Jan Kara <jack@...e.cz>
To: Gerald Yang <gerald.yang@...onical.com>
Cc: tytso@....edu, adilger.kernel@...ger.ca, jack@...e.cz,
linux-ext4@...r.kernel.org, gerald.yang.tw@...il.com
Subject: Re: [PATCH] ext4: Fix call trace when remounting to read only in
data=journal mode
On Wed 28-01-26 15:45:15, Gerald Yang wrote:
> Remounting the filesystem read-only in data=journal mode may dump the
> following call trace:
>
> [ 71.629350] CPU: 0 UID: 0 PID: 177 Comm: kworker/u96:5 Tainted: G E 6.19.0-rc7 #1 PREEMPT(voluntary)
> [ 71.629352] Tainted: [E]=UNSIGNED_MODULE
> [ 71.629353] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009)/LXD, BIOS unknown 2/2/2022
> [ 71.629354] Workqueue: writeback wb_workfn (flush-7:4)
> [ 71.629359] RIP: 0010:ext4_journal_check_start+0x8b/0xd0
> [ 71.629360] Code: 31 ff 45 31 c0 45 31 c9 e9 42 ad c4 00 48 8b 5d f8 b8 fb ff ff ff c9 31 d2 31 c9 31 f6 31 ff 45 31 c0 45 31 c9 c3 cc cc cc cc <0f> 0b b8 e2 ff ff ff eb c2 0f 0b eb a9 44 8b 42 08 68 c7 53 ce b8
> [ 71.629361] RSP: 0018:ffffcf32c0fdf6a8 EFLAGS: 00010202
> [ 71.629364] RAX: ffff8f08c8505000 RBX: ffff8f08c67ee800 RCX: 0000000000000000
> [ 71.629366] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [ 71.629367] RBP: ffffcf32c0fdf6b0 R08: 0000000000000001 R09: 0000000000000000
> [ 71.629368] R10: ffff8f08db18b3a8 R11: 0000000000000000 R12: 0000000000000000
> [ 71.629368] R13: 0000000000000002 R14: 0000000000000a48 R15: ffff8f08c67ee800
> [ 71.629369] FS: 0000000000000000(0000) GS:ffff8f0a7d273000(0000) knlGS:0000000000000000
> [ 71.629370] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 71.629371] CR2: 00007b66825905cc CR3: 000000011053d004 CR4: 0000000000772ef0
> [ 71.629374] PKRU: 55555554
> [ 71.629374] Call Trace:
> [ 71.629378] <TASK>
> [ 71.629382] __ext4_journal_start_sb+0x38/0x1c0
> [ 71.629383] mpage_prepare_extent_to_map+0x4af/0x580
> [ 71.629389] ? sbitmap_get+0x73/0x180
> [ 71.629399] ext4_do_writepages+0x3cc/0x10a0
> [ 71.629400] ? kvm_sched_clock_read+0x11/0x20
> [ 71.629409] ext4_writepages+0xc8/0x1b0
> [ 71.629410] ? ext4_writepages+0xc8/0x1b0
> [ 71.629411] do_writepages+0xc4/0x180
> [ 71.629416] __writeback_single_inode+0x45/0x350
> [ 71.629419] ? _raw_spin_unlock+0xe/0x40
> [ 71.629423] writeback_sb_inodes+0x260/0x5c0
> [ 71.629425] ? __schedule+0x4d1/0x1870
> [ 71.629429] __writeback_inodes_wb+0x54/0x100
> [ 71.629431] ? queue_io+0x82/0x140
> [ 71.629433] wb_writeback+0x1ab/0x330
> [ 71.629448] wb_workfn+0x31d/0x410
> [ 71.629450] process_one_work+0x191/0x3e0
> [ 71.629455] worker_thread+0x2e3/0x420
>
> This issue can be easily reproduced by:
> mkdir -p mnt
> dd if=/dev/zero of=ext4disk bs=1G count=2 oflag=direct
> mkfs.ext4 ext4disk
> tune2fs -o journal_data ext4disk
> mount ext4disk mnt
> fio --name=fiotest --rw=randwrite --bs=4k --runtime=3 --ioengine=libaio --iodepth=128 --numjobs=4 --filename=mnt/fiotest --filesize=1G --group_reporting
> mount -o remount,ro ext4disk mnt
> sync
>
> In data=journal mode, both metadata and data are written to the journal
> first. The second write, which puts the data at its final on-disk
> location, is left to the writeback thread.
>
> After the filesystem is remounted read-only, the writeback thread still
> tries to write data to it, which triggers the trace above. Return early
> to avoid starting a journal transaction on a read-only filesystem; once
> the filesystem becomes writable again, the writeback thread will
> continue writing data.
>
> Signed-off-by: Gerald Yang <gerald.yang@...onical.com>

Thanks for the report and the patch! I can indeed reproduce this warning.
But the patch itself is certainly not the right fix for this problem.
ext4_remount() must make sure there are no dirty pages left on the
filesystem when remounting it read-only, and it apparently fails to do so.
In particular, it calls sync_filesystem(), which should make sure all data
is written. So this bug needs more investigation into why some dirty pages
are left in the inode in data=journal mode, because ext4_writepages()
should have written them all...

Honza
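
The invariant described above (a read-only remount must leave no dirty
pages behind, otherwise later writeback ends up trying to write to a
read-only filesystem) can be illustrated with a toy user-space model. This
is only a sketch: it is not kernel code, every toy_* name is hypothetical,
and the skip_mask parameter merely simulates a sync that misses some pages.
It says nothing about why ext4 actually leaves pages dirty.

```c
/* Toy user-space model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8

struct toy_fs {
	bool read_only;
	bool dirty[TOY_NPAGES];	/* per-page dirty flags */
	int ro_write_attempts;	/* writeback attempts seen after going read-only */
};

/* Count the pages that are still marked dirty. */
static size_t toy_dirty_count(const struct toy_fs *fs)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_NPAGES; i++)
		if (fs->dirty[i])
			n++;
	return n;
}

/*
 * Clean every dirty page except those selected by skip_mask.
 * A non-zero skip_mask is an assumption used to simulate an incomplete sync.
 */
static void toy_sync(struct toy_fs *fs, unsigned int skip_mask)
{
	assert(fs != NULL);
	for (size_t i = 0; i < TOY_NPAGES; i++)
		if (!(skip_mask & (1u << i)))
			fs->dirty[i] = false;
}

/* Remount read-only: sync first, then flip the read-only flag. */
static void toy_remount_ro(struct toy_fs *fs, unsigned int skip_mask)
{
	toy_sync(fs, skip_mask);
	fs->read_only = true;
}

/*
 * Background writeback pass. Returns the number of pages written.
 * A dirty page found on a read-only filesystem is counted in
 * ro_write_attempts, which stands in for the warning in the trace.
 */
static size_t toy_writeback(struct toy_fs *fs)
{
	size_t written = 0;

	for (size_t i = 0; i < TOY_NPAGES; i++) {
		if (!fs->dirty[i])
			continue;
		if (fs->read_only) {
			fs->ro_write_attempts++;
			continue;
		}
		fs->dirty[i] = false;
		written++;
	}
	return written;
}
```

Calling toy_remount_ro() with skip_mask == 0 and then toy_writeback()
leaves ro_write_attempts at zero, which is the expected behaviour. A
non-zero skip_mask leaves dirty pages behind, and the following
toy_writeback() call counts one attempt per leftover page.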
> ---
> fs/ext4/inode.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 15ba4d42982f..4e3bbf17995e 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2787,6 +2787,17 @@ static int ext4_do_writepages(struct mpage_da_data *mpd)
>  	if (unlikely(ret))
>  		goto out_writepages;
>
> +	/*
> +	 * For data=journal, if the filesystem was remounted read-only,
> +	 * the writeback thread may still write dirty pages to it.
> +	 * Return early to avoid starting a journal transaction on a
> +	 * read-only filesystem.
> +	 */
> +	if (ext4_should_journal_data(inode) && sb_rdonly(inode->i_sb)) {
> +		ret = -EROFS;
> +		goto out_writepages;
> +	}
> +
>  	/*
>  	 * If we have inline data and arrive here, it means that
>  	 * we will soon create the block for the 1st page, so
> --
> 2.43.0
>
--
Jan Kara <jack@...e.com>
SUSE Labs, CR