lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aSla8sc66ys6zCGZ@li-dc0c254c-257c-11b2-a85c-98b6c1322444.ibm.com>
Date: Fri, 28 Nov 2025 13:48:58 +0530
From: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
To: Zhang Yi <yi.zhang@...weicloud.com>
Cc: linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-kernel@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca,
        jack@...e.cz, yi.zhang@...wei.com, yizhang089@...il.com,
        libaokun1@...wei.com, yangerkun@...wei.com
Subject: Re: [PATCH v2 06/13] ext4: don't cache extent during splitting extent

On Thu, Nov 27, 2025 at 03:01:27PM +0800, Zhang Yi wrote:
> On 11/26/2025 8:04 PM, Ojaswin Mujoo wrote:
> > On Fri, Nov 21, 2025 at 02:08:04PM +0800, Zhang Yi wrote:
> >> From: Zhang Yi <yi.zhang@...wei.com>
> >>
> >> Caching extents during the splitting process is risky, as it may result
> >> in stale extents remaining in the status tree. Moreover, in most cases,
> >> the corresponding extent block entries are likely already cached before
> >> the split happens, making caching here not particularly useful.
> >>
> >> Assume we have an unwritten extent, and then DIO writes the first half.
> >>
> >>   [UUUUUUUUUUUUUUUU] on-disk extent        U: unwritten extent
> >>   [UUUUUUUUUUUUUUUU] extent status tree
> >>   |<-   ->| ----> dio write this range
> >>
> >> First, when ext4_split_extent_at() splits this extent, it truncates the
> >> existing extent and then inserts a new one. During this process, this
> >> extent status entry may be shrunk, and calls to ext4_find_extent() and
> >> ext4_cache_extents() may occur, which could potentially insert the
> >> truncated range as a hole into the extent status tree. After the split
> >> is completed, this hole is not replaced with the correct status.
> >>
> >>   [UUUUUUU|UUUUUUUU] on-disk extent        U: unwritten extent
> >>   [UUUUUUU|HHHHHHHH] extent status tree    H: hole
> >>
> >> Then, the outer calling functions will not correct this remaining hole
> >> extent either. Finally, if we perform a delayed buffer write on this
> >> latter part, it will re-insert the delayed extent and cause an error in
> >> space accounting.
> > 
> > Okay, makes sense. So one basic question, I see that in
> > ext4_ext_insert_extent() doesnt really care about updating es unless as a
> > side effect of ext4_find_extent().  For example, if we end up at goto
> > has_space; we don't add the new extent and niether do we update that the
> > exsisting extent has shrunk. 
> > 
> > Similarly, the splitting code also doesn't seem to care about the es
> > cache other than zeroout in the error handling.
> > 
> > Is there a reason for this? Do we expect the upper layers to maintain
> > the es cache?
> 
> Yeah, if we don't consider the zeroout case caused by a failed split,
> under typical circumstances, the ext4_es_insert_extent() in
> ext4_map_create_blocks() is called to insert or update the cached
> extent entries.
> 
> However, ext4_map_create_blocks() only insert or update
> the range that the caller want to map, it can't know the actual
> initialized range if this extent has been zeroed out, so we have to
> update the extent cache in ext4_split_extent_at() for this special case.
> Please see commit adb2355104b2 ("ext4: update extent status tree after
> an extent is zeroed out") for details.
> 
> Unfortunately, the legacy scenario described in this patch remains
> unhandled.

Got it thanks for the details.

Feel free to add:
Reviewed-by: Ojaswin Mujoo <ojaswin@...ux.ibm.com>

Regards,
ojaswin

> 
> Cheers,
> Yi.
> 
> >>
> >> In adition, if the unwritten extent cache is not shrunk during the
> >> splitting, ext4_cache_extents() also conflicts with existing extents
> >> when caching extents. In the future, we will add checks when caching
> >> extents, which will trigger a warning. Therefore, Do not cache extents
> >> that are being split.
> >>
> >> Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
> >> ---
> >>  fs/ext4/extents.c | 6 ++++++
> >>  1 file changed, 6 insertions(+)
> >>
> >> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
> >> index 19338f488550..2b5aec3f8882 100644
> >> --- a/fs/ext4/extents.c
> >> +++ b/fs/ext4/extents.c
> >> @@ -3199,6 +3199,9 @@ static struct ext4_ext_path *ext4_split_extent_at(handle_t *handle,
> >>  	BUG_ON((split_flag & EXT4_EXT_DATA_VALID1) &&
> >>  	       (split_flag & EXT4_EXT_DATA_VALID2));
> >>  
> >> +	/* Do not cache extents that are in the process of being modified. */
> >> +	flags |= EXT4_EX_NOCACHE;
> >> +
> >>  	ext_debug(inode, "logical block %llu\n", (unsigned long long)split);
> >>  
> >>  	ext4_ext_show_leaf(inode, path);
> >> @@ -3364,6 +3367,9 @@ static struct ext4_ext_path *ext4_split_extent(handle_t *handle,
> >>  	ee_len = ext4_ext_get_actual_len(ex);
> >>  	unwritten = ext4_ext_is_unwritten(ex);
> >>  
> >> +	/* Do not cache extents that are in the process of being modified. */
> >> +	flags |= EXT4_EX_NOCACHE;
> >> +
> >>  	if (map->m_lblk + map->m_len < ee_block + ee_len) {
> >>  		split_flag1 = split_flag & EXT4_EXT_MAY_ZEROOUT;
> >>  		flags1 = flags | EXT4_GET_BLOCKS_PRE_IO;
> >> -- 
> >> 2.46.1
> >>
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ