Message-ID: <20251106133530.12927-1-hans.holmberg@wdc.com>
Date: Thu, 6 Nov 2025 14:35:30 +0100
From: Hans Holmberg <hans.holmberg@....com>
To: linux-xfs@...r.kernel.org
Cc: Carlos Maiolino <cem@...nel.org>,
Dave Chinner <david@...morbit.com>,
"Darrick J . Wong" <djwong@...nel.org>,
Christoph Hellwig <hch@....de>,
linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org,
libc-alpha@...rceware.org,
Hans Holmberg <hans.holmberg@....com>
Subject: [RFC] xfs: fake fallocate success for always CoW inodes
We don't support preallocations for CoW inodes and we currently fail
with -EOPNOTSUPP, but this causes an issue for users of glibc's
posix_fallocate[1]. If fallocate fails, posix_fallocate falls back on
writing actual data into the range to try to allocate blocks that way.
That does not actually guarantee anything for CoW inodes, however, as we
write out of place.
So, for this case, users of posix_fallocate will end up writing data
unnecessarily AND be left with a broken promise of being able to
overwrite the range without ending up with -ENOSPC.
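
For illustration, a write-based fallback of the kind described above can
be sketched roughly as follows. This is a simplified, hypothetical helper:
emulate_prealloc_by_writing() is not a glibc function and this is not
glibc's actual code. It assumes that touching one byte per st_blksize-sized
block is enough to make a filesystem allocate that block.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical illustration of a write-based preallocation fallback.
 * Returns 0 on success or an errno value, mirroring the return
 * convention of posix_fallocate().
 */
int emulate_prealloc_by_writing(int fd, off_t offset, off_t len)
{
	struct stat st;
	off_t end, step, pos;

	if (offset < 0 || len <= 0)
		return EINVAL;
	if (fstat(fd, &st) != 0)
		return errno;

	end = offset + len;
	step = st.st_blksize > 0 ? st.st_blksize : 512;

	for (pos = offset; pos < end; pos += step) {
		/* Inside the existing file: leave nonzero data untouched. */
		if (pos < st.st_size) {
			unsigned char c = 0;
			ssize_t n = pread(fd, &c, 1, pos);

			if (n < 0)
				return errno;
			if (n == 1 && c != 0)
				continue;
		}
		/* Write a single zero byte to force a block allocation. */
		if (pwrite(fd, "", 1, pos) != 1)
			return errno ? errno : EIO;
	}

	/* Extend the file to cover the full range if it was shorter. */
	if (st.st_size < end && pwrite(fd, "", 1, end - 1) != 1)
		return errno ? errno : EIO;

	return 0;
}
```

On an always CoW inode every one of these writes goes to newly allocated
blocks, and so does any later overwrite of the same range, which is why
this kind of fallback reserves nothing.
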
So, to avoid the useless data copy that just increases the risk of
-ENOSPC, warn the user and fake that the allocation was successful.
User space using fallocate[2] for preallocation will now be notified of
the missing support for CoW inodes via a logged warning instead of via
the return value. This is not great, but having posix_fallocate write
useless data and still not guarantee overwrites is arguably worse.
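
As a minimal sketch of the kind of caller affected, consider user space
that calls fallocate directly and relies on the return value to learn
that preallocation is unsupported. The names try_prealloc() and enum
prealloc_result are hypothetical and only serve the illustration.

```c
#define _GNU_SOURCE /* fallocate() is a Linux-specific glibc extension */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical result codes for this illustration. */
enum prealloc_result {
	PREALLOC_FAILED = -1,
	PREALLOC_OK = 0,
	PREALLOC_UNSUPPORTED = 1,
};

/*
 * Hypothetical helper: ask the filesystem to preallocate the range and
 * report "unsupported" separately instead of falling back to writes.
 */
enum prealloc_result try_prealloc(int fd, off_t offset, off_t len)
{
	int ret;

	assert(offset >= 0 && len > 0);

	/* Mode 0 is the default allocate-range operation; retry on EINTR. */
	do {
		ret = fallocate(fd, 0, offset, len);
	} while (ret != 0 && errno == EINTR);

	if (ret == 0)
		return PREALLOC_OK;

	if (errno == EOPNOTSUPP) {
		fprintf(stderr, "preallocation not supported, continuing without it\n");
		return PREALLOC_UNSUPPORTED;
	}

	return PREALLOC_FAILED;
}
```

With this change, such a caller would get PREALLOC_OK on an always CoW
inode, and only the logged warning would indicate that nothing was
actually reserved.
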
A mount option to choose between these two evils would be good to add,
but we would need to agree on the default value first.
[1] https://man7.org/linux/man-pages/man3/posix_fallocate.3.html
[2] https://man7.org/linux/man-pages/man2/fallocate.2.html
Signed-off-by: Hans Holmberg <hans.holmberg@....com>
---
fs/xfs/xfs_bmap_util.c | 15 ++++++++++++++-
fs/xfs/xfs_file.c | 7 -------
2 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 06ca11731e43..ff7f6aa41fc8 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -659,8 +659,21 @@ xfs_alloc_file_space(
xfs_bmbt_irec_t imaps[1], *imapp;
int error;
- if (xfs_is_always_cow_inode(ip))
+ /*
+ * In always_cow mode we can't use preallocations and thus should not
+ * create them.
+ */
+ if (xfs_is_always_cow_inode(ip)) {
+ /*
+ * Instead of failing the fallocate, pretend it was successful
+ * to keep glibc's posix_fallocate from falling back on writing
+ * actual data, which would not guarantee that the range can be
+ * overwritten either.
+ */
+ xfs_warn_once(mp,
+"Always CoW inodes do not support preallocations, faking fallocate success.");
return 0;
+ }
trace_xfs_alloc_file_space(ip);
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 2702fef2c90c..91e2693873c0 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1312,13 +1312,6 @@ xfs_falloc_allocate_range(
loff_t new_size = 0;
int error;
- /*
- * If always_cow mode we can't use preallocations and thus should not
- * create them.
- */
- if (xfs_is_always_cow_inode(XFS_I(inode)))
- return -EOPNOTSUPP;
-
error = xfs_falloc_newsize(file, mode, offset, len, &new_size);
if (error)
return error;
--
2.34.1