linux-kernel - Re: [PATCH] ocfs2: fix stale extent map cache during COW operations


Message-ID: <20251009142917.517229-1-kartikey406@gmail.com>
Date: Thu,  9 Oct 2025 19:59:16 +0530
From: Deepanshu Kartikey <kartikey406@...il.com>
To: joseph.qi@...ux.alibaba.com,
	mark@...heh.com,
	jlbec@...lplan.org
Cc: ocfs2-devel@...ts.linux.dev,
	linux-kernel@...r.kernel.org,
	syzbot+6fdd8fa3380730a4b22c@...kaller.appspotmail.com
Subject: Re: [PATCH] ocfs2: fix stale extent map cache during COW operations


Hi Joseph,

Thank you for the review. You are right: the cache clearing at the end of ocfs2_refcount_cow_hunk() already handles the COW path correctly.

After further debugging with the syzbot reproducer, I found that the real issue is in the FITRIM/move_extents code path. The bug occurs when:

1. copy_file_range() creates a reflinked extent with flags=0x2 (OCFS2_EXT_REFCOUNTED)
2. ioctl(FITRIM) is called, which triggers ocfs2_move_extents()
3. In __ocfs2_move_extents_range(), the while loop:
   - Calls ocfs2_get_clusters() which reads extent with flags=0x2 and caches it
   - Then calls ocfs2_move_extent() or ocfs2_defrag_extent()
   - Both eventually call __ocfs2_move_extent() which contains:
       replace_rec.e_flags = ext_flags & ~OCFS2_EXT_REFCOUNTED;
   - This clears the refcount flag and writes to disk with flags=0x0
4. However, the extent map cache is NOT cleared after the move operation
5. The cache still holds the stale flags=0x2 while the on-disk extent now has flags=0x0
6. Later, when write() triggers COW, ocfs2_refcount_cal_cow_clusters() reads:
   - From cache: flags=0x2 (stale)
   - From disk extent tree: flags=0x0 (correct)
7. The mismatch triggers: BUG_ON(!(rec->e_flags & OCFS2_EXT_REFCOUNTED))

The proper fix should be in __ocfs2_move_extents_range() to clear the extent cache after each move/defrag operation completes. I will send a v2 patch with this fix.
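
To make the sequence above concrete, here is a minimal userspace sketch of the stale-cache pattern and of the proposed invalidation. It is a toy model, not kernel code: struct toy_extent and all toy_* helpers are hypothetical names made up for illustration, and the "cache" is a single entry. In the kernel the invalidation would presumably be a call such as ocfs2_extent_map_trunc(), but that is my assumption about the shape of the v2 patch, not something confirmed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXT_REFCOUNTED 0x2u /* mirrors flags=0x2 (OCFS2_EXT_REFCOUNTED) above */

/* Hypothetical stand-in for one extent: on-disk flags plus a one-entry cache. */
struct toy_extent {
	uint8_t disk_flags;
	uint8_t cached_flags;
	bool cache_valid;
};

/* Lookup that prefers the cache and fills it from "disk" on a miss. */
static uint8_t toy_get_flags(struct toy_extent *e)
{
	if (!e->cache_valid) {
		e->cached_flags = e->disk_flags;
		e->cache_valid = true;
	}
	return e->cached_flags;
}

/* Drop the cached entry so the next lookup rereads "disk". */
static void toy_cache_trunc(struct toy_extent *e)
{
	e->cache_valid = false;
}

/*
 * Model of the move step: the lookup caches the old flags, then the
 * record is rewritten with the refcount flag cleared. With
 * invalidate == false the cache keeps the old flags (the bug).
 */
static void toy_move_extent(struct toy_extent *e, bool invalidate)
{
	uint8_t ext_flags = toy_get_flags(e);

	e->disk_flags = ext_flags & (uint8_t)~EXT_REFCOUNTED;
	if (invalidate)
		toy_cache_trunc(e);
}

/* True when the cached view and the disk agree on the refcount flag. */
static bool toy_cow_consistent(struct toy_extent *e)
{
	uint8_t seen = toy_get_flags(e);

	return (seen & EXT_REFCOUNTED) == (e->disk_flags & EXT_REFCOUNTED);
}
```

Calling toy_move_extent(&e, false) on an extent that starts with disk_flags = EXT_REFCOUNTED leaves toy_cow_consistent(&e) false, which corresponds to the mismatch in steps 5 to 7; passing true makes the next lookup reread the disk value, so the two views agree again.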

Thanks,
Deepanshu