linux-ext4 - Re: [PATCH v2 03/13] ext4: don't zero the entire extent if EXT4_EXT_DATA_PARTIAL

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ihvyl3ayookm5b2tcjz63tfhdobn64lbzudiv7w3hezs6ykzyd@p2euigrkhmxm>
Date: Fri, 28 Nov 2025 11:58:47 +0100
From: Jan Kara <jack@...e.cz>
To: Zhang Yi <yi.zhang@...weicloud.com>
Cc: Jan Kara <jack@...e.cz>, linux-ext4@...r.kernel.org, 
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, tytso@....edu, 
	adilger.kernel@...ger.ca, yi.zhang@...wei.com, yizhang089@...il.com, libaokun1@...wei.com, 
	yangerkun@...wei.com
Subject: Re: [PATCH v2 03/13] ext4: don't zero the entire extent if
 EXT4_EXT_DATA_PARTIAL_VALID1

On Fri 28-11-25 11:45:51, Zhang Yi wrote:
> On 11/27/2025 9:41 PM, Jan Kara wrote:
> > I think the code would be much clearer
> > if we just centralized all the zeroing in ext4_split_extent(). At that
> > place the situation is actually pretty simple:
> 
> Thank you for your suggestion!
> 
> > 
> > 1) 'ex' is unwritten, 'map' describes part with already written data which
> > we want to convert to initialized (generally IO completion situation) => we
> > can zero out boundaries if they are smaller than max_zeroout or if extent
> > split fails.
> > 
> 
> Yes. Agree.
> 
> > 2) 'ex' is unwritten, 'map' describes part we are preparing for write (IO
> > submission) => the split is opportunistic here, if we cannot split due to
> > ENOSPC, just go on and deal with it at IO completion time. No zeroing
> > needed.
> 
> Yes. At the same time, if we can indeed move the entire split unwritten
> operation to be handled after I/O completion in the future, it would also be
> more convenient to remove this segment of logic.

Yes.

> > 3) 'ex' is written, 'map' describes part that should be converted to
> > unwritten => we can zero out the 'map' part if smaller than max_zeroout or
> > if extent split fails.
> 
> This makes sense to me! This case it originates from the fallocate with Zero
> Range operation. Currently, the zero-out operation will not be performed if
> the split operation fails, instead, it immediately returns a failure.
> 
> I agree with you that we can do zero out if the 'map' part smaller than
> max_zeroout instead of split extents. However, if the 'map' part is bigger
> than max_zeroout and if extent split fails, I don't think zero out is a good
> idea, Because it might cause zero-range calls to take a long time to execute.
> Although fallocate doesn't explicitly specify how ZERO_RANGE should be
> implemented, users expect it to be very fast. Therefore, in this case, if the
> split fails, it would be better to simply return an error, leave things as
> they are. What do you think?

True. Just returning the error is a good option in this case.

> > This should all result in a relatively straightforward code where we can
> > distinguish the three cases based on 'ex' and passed flags, we should be
> > able to drop the 'EXT4_EXT_DATA_VALID*' flags and logic (possibly we could
> > drop the 'split_flag' argument of ext4_split_extent() altogether), and fix
> > the data exposure issues at the same time. What do you think? Am I missing
> > some case?
> 
> Indeed, I think the overall solution is a nice cleanup idea. :-)
> But this would involve a significant amount of refactoring and logical changes.
> Could we first merge the current set of patches(it could be more easier to
> backport to the early LTS version), and then I can start a new series to
> address this optimization?

I agree the changes are rather intrusive and the code is complex. Since you
have direct fixes already written, let's merge them first and cleanup
afterwards as you suggest.

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR