Message-ID: <20260214212452.782265-44-sashal@kernel.org>
Date: Sat, 14 Feb 2026 16:23:09 -0500
From: Sasha Levin <sashal@...nel.org>
To: patches@...ts.linux.dev,
	stable@...r.kernel.org
Cc: Zhang Yi <yi.zhang@...wei.com>,
	Jan Kara <jack@...e.cz>,
	Baokun Li <libaokun1@...wei.com>,
	Ojaswin Mujoo <ojaswin@...ux.ibm.com>,
	Theodore Ts'o <tytso@....edu>,
	Sasha Levin <sashal@...nel.org>,
	adilger.kernel@...ger.ca,
	linux-ext4@...r.kernel.org
Subject: [PATCH AUTOSEL 6.19-6.12] ext4: use reserved metadata blocks when splitting extent on endio

From: Zhang Yi <yi.zhang@...wei.com>

[ Upstream commit 01942af95ab6c9d98e64ae01fdc243a03e4b973f ]

When performing buffered writes, we may need to split and convert an
unwritten extent into a written one during the end I/O process. However,
we do not reserve space specifically for these metadata changes, we only
reserve 2% of space or 4096 blocks. To address this, we use
EXT4_GET_BLOCKS_PRE_IO to potentially split extents in advance and
EXT4_GET_BLOCKS_METADATA_NOFAIL to utilize reserved space if necessary.

These two approaches can reduce the likelihood of running out of space
and losing data. However, these methods are merely best efforts, we
could still run out of space, and there is not much difference between
converting an extent during the writeback process and the end I/O
process, it won't increase the risk of losing data if we postpone the
conversion.

Therefore, also use EXT4_GET_BLOCKS_METADATA_NOFAIL in
ext4_convert_unwritten_extents_endio() to prepare for the buffered I/O
iomap conversion, which may perform extent conversion during the end I/O
process.

Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
Reviewed-by: Jan Kara <jack@...e.cz>
Reviewed-by: Baokun Li <libaokun1@...wei.com>
Reviewed-by: Ojaswin Mujoo <ojaswin@...ux.ibm.com>
Link: https://patch.msgid.link/20260105014522.1937690-2-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@....edu>
Signed-off-by: Sasha Levin <sashal@...nel.org>
---

LLM Generated explanations, may be completely bogus:

The commit message says "to prepare for the buffered I/O iomap
conversion, which may perform extent conversion during the end I/O
process." This suggests it's preparation for a future change. However,
the bug exists **independently**: even in the current code path through
`ext4_ext_handle_unwritten_extents`, the flag is set at line 3908, but
the `ext4_convert_unwritten_extents_endio` function hardcodes only
`EXT4_GET_BLOCKS_CONVERT` when calling `ext4_split_convert_extents` at
lines 3779-3780.

Tracing the flow in more detail: when
`ext4_ext_handle_unwritten_extents()` calls
`ext4_convert_unwritten_extents_endio()`, the flags variable with
`METADATA_NOFAIL` is local to `ext4_ext_handle_unwritten_extents()`.
`ext4_convert_unwritten_extents_endio()` doesn't receive those flags as
a parameter; it constructs its own flags (`EXT4_GET_BLOCKS_CONVERT`)
internally at line 3780. So **the METADATA_NOFAIL flag is NOT
propagated** to the split operation inside
`ext4_convert_unwritten_extents_endio()`.

This is a real bug that exists in the current codebase, not just a
preparation for future code. The split operation during endio can fail
with ENOSPC because it doesn't use reserved metadata blocks.
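
The propagation gap described above can be sketched as a toy model in
plain C. This is not kernel code: the `sketch_*` names, the flag values,
and the two-pool allocator are hypothetical stand-ins, and the model
assumes the split needs exactly one new metadata block. It only shows
why a callee that builds its own flags never reaches the reserve.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in values; the real flags live in fs/ext4/ext4.h. */
#define SKETCH_CONVERT         0x1u
#define SKETCH_METADATA_NOFAIL 0x2u

/* Toy allocator state: a general pool and a reserved metadata pool. */
struct sketch_fs {
	unsigned int free_blocks;
	unsigned int reserved_blocks;
};

/* Allocate one metadata block; use the reserve only if NOFAIL is set. */
int sketch_alloc_meta(struct sketch_fs *fs, unsigned int flags)
{
	if (fs->free_blocks > 0) {
		fs->free_blocks--;
		return 0;
	}
	if ((flags & SKETCH_METADATA_NOFAIL) && fs->reserved_blocks > 0) {
		fs->reserved_blocks--;
		return 0;
	}
	return -ENOSPC;
}

/* Models the split step, assumed here to need one new metadata block. */
int sketch_split_convert(struct sketch_fs *fs, unsigned int flags)
{
	assert(flags & SKETCH_CONVERT);
	return sketch_alloc_meta(fs, flags);
}

/* Before the patch: the callee hardcodes CONVERT, so NOFAIL is absent. */
int sketch_endio_before(struct sketch_fs *fs)
{
	return sketch_split_convert(fs, SKETCH_CONVERT);
}

/* After the patch: the callee adds NOFAIL to its own flags. */
int sketch_endio_after(struct sketch_fs *fs)
{
	unsigned int flags = SKETCH_CONVERT | SKETCH_METADATA_NOFAIL;

	return sketch_split_convert(fs, flags);
}
```

With `free_blocks == 0` and a non-empty reserve, `sketch_endio_before()`
returns `-ENOSPC` and leaves the reserve untouched, while
`sketch_endio_after()` succeeds by consuming one reserved block.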

### 3. Classification

**Bug fix**: Prevents potential data loss on near-full ext4 filesystems
when extent splitting is needed during endio. When the filesystem is
nearly full, the extent conversion can fail because it doesn't tap into
the reserved metadata pool. This failure at endio means written data may
appear as unwritten (zeroed), which is **data loss**.
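
Why a failed conversion reads back as zeroes can be illustrated with
another toy model. The `sketch_*` names and the single-block "extent"
are hypothetical and far simpler than ext4's real read and endio paths.
The only assumption is the one stated above: reads of an extent still
marked unwritten return zeroes, whatever is on disk.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 8

/* One-block "extent": the on-disk bytes plus the unwritten marker. */
struct sketch_extent {
	bool unwritten;
	char disk[SKETCH_BLOCK_SIZE];
};

/* Reads of an unwritten extent return zeroes regardless of disk data. */
void sketch_read(const struct sketch_extent *ex, char *out)
{
	assert(ex && out);
	if (ex->unwritten)
		memset(out, 0, SKETCH_BLOCK_SIZE);
	else
		memcpy(out, ex->disk, SKETCH_BLOCK_SIZE);
}

/*
 * Write completion: the data has reached disk, then the extent is
 * converted. convert_err models the result of the conversion step
 * (0 on success, a negative errno such as -ENOSPC on failure).
 */
void sketch_write_endio(struct sketch_extent *ex, const char *data,
			int convert_err)
{
	assert(ex && data);
	memcpy(ex->disk, data, SKETCH_BLOCK_SIZE);
	if (convert_err == 0)
		ex->unwritten = false;
}
```

In this model, a write whose conversion fails leaves the payload in
`disk` but still marked unwritten, so `sketch_read()` hands back zeroes.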

### 4. Scope and Risk Assessment

- **Lines changed**: 4 (3 insertions, 1 deletion; adds one flag to an
  existing call)
- **Files changed**: 1 (fs/ext4/extents.c)
- **Risk**: Extremely low. The `EXT4_GET_BLOCKS_METADATA_NOFAIL` flag is
  already used elsewhere in the same function's caller
  (`ext4_ext_handle_unwritten_extents`). This just ensures the flag is
  also used when the called function internally needs to split extents.
- **Subsystem**: ext4, a very widely deployed Linux filesystem. The
  change touches only the endio extent-conversion path.

### 5. User Impact

- **Who**: Any user with an ext4 filesystem that is near-full performing
  buffered writes where extent splitting is needed during endio
- **Severity**: Data loss — written data appears zeroed because the
  extent remains marked as unwritten
- **Likelihood**: Increases as filesystem fills up; real-world scenario
  on busy servers

### 6. Stability Indicators

- **Reviewed-by**: Jan Kara (ext4 co-maintainer), Baokun Li, Ojaswin
  Mujoo — three reviewers
- **Signed-off-by**: Theodore Ts'o (ext4 maintainer), who applied the
  patch
- This level of review indicates high confidence in the fix

### 7. Dependency Check

The change is entirely self-contained. It only adds an existing flag
(`EXT4_GET_BLOCKS_METADATA_NOFAIL`) to an existing function call. No
dependencies on other commits. The affected code
(`ext4_convert_unwritten_extents_endio`) has been in the kernel for many
years and exists in all stable trees.

### Summary

This is a small, surgical fix for a real data loss scenario in ext4, a
very widely deployed Linux filesystem. When the filesystem is near full,
extent conversion during endio can fail because it doesn't use the
reserved metadata block pool. The fix adds a single flag
(`EXT4_GET_BLOCKS_METADATA_NOFAIL`) that the caller already sets but
that never reached the split inside the endio function. It has been
reviewed by three ext4 developers and signed off by the subsystem
maintainer. The risk is minimal and the benefit is preventing data loss.

**YES**

 fs/ext4/extents.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 2cf5759ba6894..f1322f64071ff 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3770,6 +3770,8 @@ ext4_convert_unwritten_extents_endio(handle_t *handle, struct inode *inode,
 	 * illegal.
 	 */
 	if (ee_block != map->m_lblk || ee_len > map->m_len) {
+		int flags = EXT4_GET_BLOCKS_CONVERT |
+			    EXT4_GET_BLOCKS_METADATA_NOFAIL;
 #ifdef CONFIG_EXT4_DEBUG
 		ext4_warning(inode->i_sb, "Inode (%ld) finished: extent logical block %llu,"
 			     " len %u; IO logical block %llu, len %u",
@@ -3777,7 +3779,7 @@ ext4_convert_unwritten_extents_endio(handle_t *handle, struct inode *inode,
 			     (unsigned long long)map->m_lblk, map->m_len);
 #endif
 		path = ext4_split_convert_extents(handle, inode, map, path,
-						EXT4_GET_BLOCKS_CONVERT, NULL);
+						  flags, NULL);
 		if (IS_ERR(path))
 			return path;
 
-- 
2.51.0

