Message-ID: <20250118-vfs-libfs-675d6c542bcc@brauner>
Date: Sat, 18 Jan 2025 14:08:14 +0100
From: Christian Brauner <brauner@...nel.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Christian Brauner <brauner@...nel.org>,
linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: [GIT PULL] vfs libfs
Hey Linus,
/* Summary */
This improves the stable directory offset behavior in various ways.
Stable offsets are needed so that NFS can reliably read directories on
filesystems such as tmpfs:
- Improve the end-of-directory detection

  According to getdents(3), the d_off field in each returned directory
  entry points to the next entry in the directory. The d_off field in
  the last returned entry in the readdir buffer must contain a valid
  offset value, but if it points to an actual directory entry, then
  readdir/getdents can loop.

  Introduce a specific fixed offset value that is placed in the d_off
  field of the last entry in a directory. Some user space applications
  assume that the EOD offset value is larger than the offsets of real
  directory entries, so the largest valid offset value is reserved for
  this purpose. This new value is never allocated by
  simple_offset_add().

  When ->iterate_dir() returns, getdents{64} inserts the ctx->pos value
  into the d_off field of the last valid entry in the readdir buffer.
  When it hits EOD, offset_readdir() sets ctx->pos to the EOD offset
  value so the last entry is updated to point to the EOD marker.

  When trying to read the entry at the EOD offset, offset_readdir()
  terminates immediately.
- Rely on d_children to iterate stable offset directories

  Instead of using the mtree to emit entries in the order of their
  offset values, use it only to map incoming ctx->pos to a starting
  entry. Then use the directory's d_children list, which is already
  maintained properly by the dcache, to find the next child to emit.
- Narrow the range of directory offset values returned by
simple_offset_add() to 3 .. (S32_MAX - 1) on all platforms. This means
the allocation behavior is identical on 32-bit systems, 64-bit
systems, and 32-bit user space on 64-bit kernels. The new range still
permits over 2 billion concurrent entries per directory.
- Return ENOSPC when the directory offset range is exhausted. Hitting
this error is almost impossible though.
- Remove the simple_offset_empty() helper.
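
The offset range, the ENOSPC case, and the end-of-directory marker
described in the list above can be illustrated with a small user-space
sketch. This is not the libfs code: every toy_* name is hypothetical,
the allocator below never reuses freed offsets, and the directory is a
plain sorted array rather than an mtree plus the d_children list. It
only assumes what the summary states, namely that real entries get
offsets in 3 .. (S32_MAX - 1) and that the largest value is reserved
as the EOD marker.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Hypothetical constants mirroring the range given in the summary. */
#define TOY_OFFSET_MIN 3L
#define TOY_OFFSET_EOD ((long)INT_MAX)       /* S32_MAX, reserved for EOD */
#define TOY_OFFSET_MAX (TOY_OFFSET_EOD - 1)  /* largest offset of a real entry */

/* Toy directory state: only the next offset to hand out. */
struct toy_dir {
	long next_offset;
};

/*
 * Hand out the next offset for a new entry. Returns 0 and stores the
 * offset in *out, or -ENOSPC once the range is exhausted. The EOD value
 * is never returned. (Simplification: freed offsets are not reused.)
 */
static int toy_offset_add(struct toy_dir *dir, long *out)
{
	if (dir->next_offset > TOY_OFFSET_MAX)
		return -ENOSPC;
	*out = dir->next_offset++;
	return 0;
}

/*
 * Emit up to 'cap' entries whose offset is >= *pos from the sorted array
 * 'offsets'. On return, *pos is where the next call should resume: the
 * offset of the first entry that did not fit, or TOY_OFFSET_EOD when the
 * directory is exhausted. This is the value a getdents-style caller
 * would store as d_off of the last emitted entry.
 */
static int toy_readdir(const long *offsets, int n, long *pos,
		       long *emitted, int cap)
{
	int count = 0;

	assert(cap > 0);

	/* Reading at the EOD offset terminates immediately. */
	if (*pos == TOY_OFFSET_EOD)
		return 0;

	for (int i = 0; i < n; i++) {
		if (offsets[i] < *pos)
			continue;
		if (count == cap) {
			/* Buffer full: resume at this entry next time. */
			*pos = offsets[i];
			return count;
		}
		emitted[count++] = offsets[i];
	}

	/* No more entries: point the caller at the EOD marker. */
	*pos = TOY_OFFSET_EOD;
	return count;
}
```

Because the EOD value is never handed out by the allocator, a resume
position equal to it cannot be confused with a real entry, so a reader
that keeps calling the toy readdir always reaches a call that emits
nothing.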
/* Testing */
gcc version 14.2.0 (Debian 14.2.0-6)
Debian clang version 16.0.6 (27+b1)
No build failures or warnings were observed.
/* Conflicts */
Merge conflicts with mainline
=============================
No known conflicts.
Merge conflicts with other trees
================================
No known conflicts.
The following changes since commit 40384c840ea1944d7c5a392e8975ed088ecf0b37:

  Linux 6.13-rc1 (2024-12-01 14:28:56 -0800)

are available in the Git repository at:

  git@...olite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.14-rc1.libfs

for you to fetch changes up to a0634b457eca16b21a4525bc40cd2db80f52dadc:

  Merge patch series "Improve simple directory offset wrap behavior" (2025-01-04 10:15:58 +0100)
Please consider pulling these changes from the signed vfs-6.14-rc1.libfs tag.
Thanks!
Christian
----------------------------------------------------------------
vfs-6.14-rc1.libfs
----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "Improve simple directory offset wrap behavior"

Chuck Lever (5):
      libfs: Return ENOSPC when the directory offset range is exhausted
      Revert "libfs: Add simple_offset_empty()"
      Revert "libfs: fix infinite directory reads for offset dir"
      libfs: Replace simple_offset end-of-directory detection
      libfs: Use d_children list to iterate simple_offset directories
 fs/libfs.c         | 162 +++++++++++++++++++++++++----------------------------
 include/linux/fs.h |   1 -
 mm/shmem.c         |   4 +-
 3 files changed, 79 insertions(+), 88 deletions(-)