Message-Id: <20240815092141.1223238-1-chizhiling@163.com>
Date: Thu, 15 Aug 2024 17:21:41 +0800
From: Chi Zhiling <chizhiling@....com>
To: mark@...heh.com,
jlbec@...lplan.org,
joseph.qi@...ux.alibaba.com
Cc: ocfs2-devel@...ts.linux.dev,
linux-kernel@...r.kernel.org,
starzhangzsd@...il.com,
Chi Zhiling <chizhiling@...inos.cn>,
Shida Zhang <zhangshida@...inos.cn>
Subject: [PATCH] ocfs2: fix unexpected zeroing of virtual disk
From: Chi Zhiling <chizhiling@...inos.cn>
In a guest virtual machine, we occasionally observed unexpected zeroing
of data:
XFS (vdb): Mounting V5 Filesystem
XFS (vdb): Ending clean mount
XFS (vdb): Metadata CRC error detected at xfs_refcountbt_read_verify+0x2c/0xf0, xfs_refcountbt block 0x200028
XFS (vdb): Unmount and run xfs_repair
XFS (vdb): First 128 bytes of corrupted metadata buffer:
00000000e0cd2f5e: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000000cafd57f5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000000d0298d7d: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000000f0698484: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000000adb789a7: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
000000005292b878: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000000885b4700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00000000fd4b4df7: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
XFS (vdb): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x200028 len 8 error 74
XFS (vdb): Error -117 recovering leftover CoW allocations.
XFS (vdb): xfs_do_force_shutdown(0x8) called from line 994 of file fs/xfs/xfs_mount.c. Return address = 000000003a53523a
XFS (vdb): Corruption of in-memory data detected. Shutting down filesystem
XFS (vdb): Please umount the filesystem and rectify the problem(s)
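It turns out that the root cause is on the physical host machine.
More specifically, it is caused by ocfs2.

When ocfs2_write_cluster() skips a page because it is the direct io
target page, it advances p_blkno by 1. That is only correct when a page
holds exactly one block. When the page size is 64k (e.g. with a 4k block
size), the block number should advance by 16 per page instead of 1.

The wrong increment leads to a wrong mapping from the pages to the disk,
which zeroes some adjacent part of the disk.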
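
For illustration only, the arithmetic behind the fix can be sketched in
userspace C. This is not kernel code: blocks_per_page() and
blkno_after_pages() are hypothetical helper names, and the 64k page / 4k
block sizes are assumed values chosen to match the "16" above.

```c
#include <assert.h>

/*
 * Hypothetical helper (not an ocfs2 function): number of filesystem
 * blocks covered by one page, i.e. 1 << (PAGE_SHIFT - s_blocksize_bits).
 */
static unsigned long blocks_per_page(unsigned int page_shift,
				     unsigned int blocksize_bits)
{
	/* A page is assumed to be at least as large as a block. */
	assert(page_shift >= blocksize_bits);
	return 1UL << (page_shift - blocksize_bits);
}

/*
 * Hypothetical helper: physical block number reached after skipping
 * 'npages' whole pages, starting from block 'start'.
 */
static unsigned long long blkno_after_pages(unsigned long long start,
					    unsigned int npages,
					    unsigned int page_shift,
					    unsigned int blocksize_bits)
{
	return start + (unsigned long long)npages *
		       blocks_per_page(page_shift, blocksize_bits);
}
```

Under these assumptions, blocks_per_page(16, 12) is 16, so skipping three
pages from block 1000 should land on block 1048, whereas incrementing by 1
per page would land on block 1003. With 4k pages and 4k blocks both
approaches agree, which is presumably why the problem only shows up with
larger page sizes.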
Suggested-by: Shida Zhang <zhangshida@...inos.cn>
Signed-off-by: Chi Zhiling <chizhiling@...inos.cn>
---
fs/ocfs2/aops.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index d6c985cc6353..1fea43c33b6b 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -1187,7 +1187,7 @@ static int ocfs2_write_cluster(struct address_space *mapping,
 		/* This is the direct io target page. */
 		if (wc->w_pages[i] == NULL) {
-			p_blkno++;
+			p_blkno += (1 << (PAGE_SHIFT - inode->i_sb->s_blocksize_bits));
 			continue;
 		}
--
2.27.0