linux-ext4 - [PATCH AUTOSEL 6.16] ext4: limit the maximum folio order

Message-Id: <20250805130945.471732-22-sashal@kernel.org>
Date: Tue,  5 Aug 2025 09:08:57 -0400
From: Sasha Levin <sashal@...nel.org>
To: patches@...ts.linux.dev,
	stable@...r.kernel.org
Cc: Zhang Yi <yi.zhang@...wei.com>,
	Naresh Kamboju <naresh.kamboju@...aro.org>,
	Joseph Qi <jiangqi903@...il.com>,
	Joseph Qi <joseph.qi@...ux.alibaba.com>,
	Jan Kara <jack@...e.cz>,
	Theodore Ts'o <tytso@....edu>,
	Sasha Levin <sashal@...nel.org>,
	adilger.kernel@...ger.ca,
	linux-ext4@...r.kernel.org
Subject: [PATCH AUTOSEL 6.16] ext4: limit the maximum folio order

From: Zhang Yi <yi.zhang@...wei.com>

[ Upstream commit b12f423d598fd874df9ecfb2436789d582fda8e6 ]

In environments with a page size of 64KB, the maximum size of a folio
can reach up to 128MB. Consequently, during the write-back of folios,
the 'rsv_blocks' will be overestimated to 1,577, which can make
pressure on the journal space where the journal is small. This can
easily exceed the limit of a single transaction. Besides, an excessively
large folio is meaningless and will instead increase the overhead of
traversing the bhs within the folio. Therefore, limit the maximum order
of a folio to 2048 filesystem blocks.

Reported-by: Naresh Kamboju <naresh.kamboju@...aro.org>
Reported-by: Joseph Qi <jiangqi903@...il.com>
Closes: https://lore.kernel.org/linux-ext4/CA+G9fYsyYQ3ZL4xaSg1-Tt5Evto7Zd+hgNWZEa9cQLbahA1+xg@mail.gmail.com/
Signed-off-by: Zhang Yi <yi.zhang@...wei.com>
Tested-by: Joseph Qi <joseph.qi@...ux.alibaba.com>
Reviewed-by: Jan Kara <jack@...e.cz>
Link: https://patch.msgid.link/20250707140814.542883-12-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o <tytso@....edu>
Signed-off-by: Sasha Levin <sashal@...nel.org>
---

LLM Generated explanations, may be completely bogus:

**Backport Status: YES**

This commit should be backported to stable kernel trees for the
following reasons:

## Bug Fix Nature
The commit fixes an issue where ext4 overestimates the journal credits
reserved for folio writeback on systems with 64KB page sizes, which
strains small journals. The commit message explicitly states:
"Consequently, during the write-back of folios, the 'rsv_blocks' will be
overestimated to 1,577, which can make pressure on the journal space
where the journal is small. This can easily exceed the limit of a single
transaction."

## Real-World Impact
1. **Reported by multiple users**: The commit includes two Reported-by
   tags and a Closes link to a bug report, indicating the problem was
   hit independently in practice
2. **Specific environment failure**: The issue manifests on systems with
   64KB page sizes (for example, some ARM64 configurations), where large
   folios can reach 128MB and the resulting credit reservation can
   exceed the limit of a single transaction

## Minimal and Safe Fix
The fix is clean and contained:
1. **Limited scope**: Changes only affect folio order calculation for
   ext4 filesystems
2. **Conservative approach**: Caps the folio size at 2048 filesystem
   blocks (2^11) by bounding the order with
   `umin(MAX_PAGECACHE_ORDER, 11 + (i)->i_blkbits - PAGE_SHIFT)`
3. **Function refactoring**: Makes `ext4_should_enable_large_folio()`
   static (it was previously declared in ext4.h) and introduces
   `ext4_set_inode_mapping_order()` as the single entry point used by
   both callers
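
The bound in item 2 can be checked with a little arithmetic. What
follows is a minimal userspace sketch in C, not kernel code:
`ext4_max_order()` and `folio_bytes()` are hypothetical helper names,
and the generic cap (`MAX_PAGECACHE_ORDER` in the kernel) is passed in
as a parameter because its value depends on the kernel configuration.
The value 11 used for the 64KB-page case is an assumption inferred from
the 128MB figure in the commit message (64KB << 11 = 128MB).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model of the patch's EXT4_MAX_PAGECACHE_ORDER() formula.
 * generic_max_order stands in for the kernel's MAX_PAGECACHE_ORDER,
 * blkbits for inode->i_blkbits, page_shift for PAGE_SHIFT.
 * All names in this sketch are illustrative.
 */
static unsigned int ext4_max_order(unsigned int generic_max_order,
				   unsigned int blkbits,
				   unsigned int page_shift)
{
	/* Guard against unsigned wrap-around in this model. */
	assert(11 + blkbits >= page_shift);

	/* Order at which a folio spans exactly 2^11 = 2048 blocks. */
	unsigned int by_blocks = 11 + blkbits - page_shift;

	/* The smaller of the two limits wins, as with umin(). */
	return by_blocks < generic_max_order ? by_blocks : generic_max_order;
}

/* Size in bytes of a folio of the given order. */
static uint64_t folio_bytes(unsigned int order, unsigned int page_shift)
{
	return (uint64_t)1 << (order + page_shift);
}
```

Under these assumptions, `ext4_max_order(11, 12, 16)` (64KB pages, 4KB
blocks) yields order 7, and `folio_bytes(7, 16)` is 8MB, which is
exactly 2048 blocks of 4KB, down from 128MB at order 11. When the block
size equals the page size the formula evaluates to 11, so the generic
cap alone decides the result.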

## Code Analysis
The changes show:
- Introduction of `EXT4_MAX_PAGECACHE_ORDER()` macro that caps folio
  size
- New function `ext4_set_inode_mapping_order()` using
  `mapping_set_folio_order_range()` instead of the previous
  `mapping_set_large_folios()`
- Updates to both inode allocation (fs/ext4/ialloc.c) and inode
  retrieval (fs/ext4/inode.c) paths

## Stability Considerations
1. **No new features**: This is purely a bug fix that prevents journal
   exhaustion
2. **Backward compatible**: The change only lowers the upper bound on
   folio order for ext4 mappings; the diff touches no on-disk format or
   user-visible interface
3. **Tested**: Has "Tested-by" tag from Joseph Qi
4. **Reviewed**: Has "Reviewed-by" tag from Jan Kara (experienced
   filesystem maintainer)

## Timeline Context
The large folio support was recently enabled in ext4 (commit
7ac67301e82f from May 2025), and this fix addresses a
regression/oversight in that implementation for systems with large page
sizes. This makes it critical to backport alongside or shortly after the
large folio enablement if that feature is backported.

The fix avoids oversized journal credit reservations during writeback on
affected systems, which per the commit message can exceed the limit of
a single transaction, making it a relevant stability fix for stable
kernels.

 fs/ext4/ext4.h   |  2 +-
 fs/ext4/ialloc.c |  3 +--
 fs/ext4/inode.c  | 22 +++++++++++++++++++---
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 18373de980f2..fe3366e98493 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3020,7 +3020,7 @@ int ext4_walk_page_buffers(handle_t *handle,
 				     struct buffer_head *bh));
 int do_journal_get_write_access(handle_t *handle, struct inode *inode,
 				struct buffer_head *bh);
-bool ext4_should_enable_large_folio(struct inode *inode);
+void ext4_set_inode_mapping_order(struct inode *inode);
 #define FALL_BACK_TO_NONDELALLOC 1
 #define CONVERT_INLINE_DATA	 2
 
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 79aa3df8d019..df4051613b29 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -1335,8 +1335,7 @@ struct inode *__ext4_new_inode(struct mnt_idmap *idmap,
 		}
 	}
 
-	if (ext4_should_enable_large_folio(inode))
-		mapping_set_large_folios(inode->i_mapping);
+	ext4_set_inode_mapping_order(inode);
 
 	ext4_update_inode_fsync_trans(handle, inode, 1);
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index be9a4cba35fd..4f4fa62a3bff 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5106,7 +5106,7 @@ static int check_igot_inode(struct inode *inode, ext4_iget_flags flags,
 	return -EFSCORRUPTED;
 }
 
-bool ext4_should_enable_large_folio(struct inode *inode)
+static bool ext4_should_enable_large_folio(struct inode *inode)
 {
 	struct super_block *sb = inode->i_sb;
 
@@ -5123,6 +5123,22 @@ bool ext4_should_enable_large_folio(struct inode *inode)
 	return true;
 }
 
+/*
+ * Limit the maximum folio order to 2048 blocks to prevent overestimation
+ * of reserve handle credits during the folio writeback in environments
+ * where the PAGE_SIZE exceeds 4KB.
+ */
+#define EXT4_MAX_PAGECACHE_ORDER(i)		\
+		umin(MAX_PAGECACHE_ORDER, (11 + (i)->i_blkbits - PAGE_SHIFT))
+void ext4_set_inode_mapping_order(struct inode *inode)
+{
+	if (!ext4_should_enable_large_folio(inode))
+		return;
+
+	mapping_set_folio_order_range(inode->i_mapping, 0,
+				      EXT4_MAX_PAGECACHE_ORDER(inode));
+}
+
 struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 			  ext4_iget_flags flags, const char *function,
 			  unsigned int line)
@@ -5440,8 +5456,8 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 		ret = -EFSCORRUPTED;
 		goto bad_inode;
 	}
-	if (ext4_should_enable_large_folio(inode))
-		mapping_set_large_folios(inode->i_mapping);
+
+	ext4_set_inode_mapping_order(inode);
 
 	ret = check_igot_inode(inode, flags, function, line);
 	/*
-- 
2.39.5