| lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
|
Open Source and information security mailing list archives
| ||
|
Message-ID: <bb5b9323-d7ea-467b-a59f-ef6e415e766a@huaweicloud.com> Date: Fri, 28 Nov 2025 12:37:33 +0800 From: Zhang Yi <yi.zhang@...weicloud.com> To: Jan Kara <jack@...e.cz> Cc: Ojaswin Mujoo <ojaswin@...ux.ibm.com>, linux-ext4@...r.kernel.org, linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, tytso@....edu, adilger.kernel@...ger.ca, yi.zhang@...wei.com, yizhang089@...il.com, libaokun1@...wei.com, yangerkun@...wei.com Subject: Re: [PATCH v2 00/13] ext4: replace ext4_es_insert_extent() when caching on-disk extents On 11/27/2025 8:24 PM, Jan Kara wrote: > Hi Yi! > > On Mon 24-11-25 13:04:04, Zhang Yi wrote: >> On 11/23/2025 6:55 PM, Ojaswin Mujoo wrote: >>> On Fri, Nov 21, 2025 at 02:07:58PM +0800, Zhang Yi wrote: >>>> Changes since v1: >>>> - Rebase the codes based on the latest linux-next 20251120. >>>> - Add patches 01-05, fix two stale data problems caused by >>> >>> Hi Zhang, thanks for the patches. >>> >> >> Thank you for take time to look at this series. >> >>> I've always felt uncomfortable with the ZEROOUT code here because it >>> seems to have many such bugs as you pointed out in the series. Its very >>> fragile and the bugs are easy to miss behind all the data valid and >>> split flags mess. >>> >> >> Yes, I agree with you. The implementation of EXT4_EXT_MAY_ZEROOUT has >> significantly increased the complexity of split extents and the >> potential for bugs. > > Yep, that code is complex and prone to bugs. > >>> As per my understanding, ZEROOUT logic seems to be a special best-effort >>> try to make the split/convert operation "work" when dealing with >>> transient errors like ENOSPC etc. I was just wondering if it makes sense >>> to just get rid of the whole ZEROOUT logic completely and just reset the >>> extent to orig state if there is any error. This allows us to get rid of >>> DATA_VALID* flags as well and makes the whole ext4_split_convert_extents() >>> slightly less messy. >>> >>> Maybe we can have a retry loop at the top level caller if we want to try >>> again for say ENOSPC or ENOMEM. >>> >>> Would love to hear your thoughts on it. >> >> I think this is a direction worth exploring. However, what I am >> currently considering is that we need to address this scenario of >> splitting extent during the I/O completion. Although the ZEROOUT logic >> is fragile and has many issues recently, it currently serves as a >> fallback solution for handling ENOSPC errors that arise when splitting >> extents during I/O completion. It ensures that I/O operations do not >> fail due to insufficient extent blocks. > > Also partial extent zeroout offers a good performance win when the > portion needing zeroout is small (we can save extent splitting). And I > agree it is a good safety net for ENOSPC issues - otherwise there's no > guarantee page writeback can finish without hitting ENOSPC. We do have > reserved blocks for these cases but the pool is limited so you can still > run out of blocks if you try hard enough. Yes, Indeed! > >> Please see ext4_convert_unwritten_extents_endio(). Although we have made >> our best effort to tried to split extents using >> EXT4_GET_BLOCKS_IO_CREATE_EXT before issuing I/Os, we still have not >> covered all scenarios. Moreover, after converting the buffered I/O path >> to the iomap infrastructure in the future, we may need to split extents >> during the I/O completion worker[1]. > > Yes, this might be worth exploring. The advantage of doing extent splitting > in advance is that on IO submission you have the opportunity of restarting > the transaction on ENOSPC to possibly release some blocks. This is not > easily doable e.g. on writeback completion so the pressure on the pool of > reserved blocks is going to be more common (previously you needed reserved > blocks only when writeback was racing with fallocate or similar, now you > may need them each time you write in the middle of unwritten extent). So I > think the change will need some testing whether it isn't too easy to hit > ENOSPC conditions on IO completion without EXT4_GET_BLOCKS_IO_CREATE_EXT. > But otherwise it's always nice to remove code :) > > Honza Yes, we need to be very careful about this, I have written a POC patch and am currently conducting related tests. I hope it works. :-) Thanks, Yi.
Powered by blists - more mailing lists