[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bff16d9e-d6e7-4d0e-9a58-6db37ec58ce7@ustc.edu>
Date: Fri, 16 Jan 2026 10:43:43 +0800
From: Chunsheng Luo <luochunsheng@...c.edu>
To: Amir Goldstein <amir73il@...il.com>
Cc: miklos@...redi.hu, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, paullawrence@...gle.com
Subject: Re: [RFC 2/2] fuse: Add new flag to reuse the backing file of
fuse_inode
On 1/15/26 11:31 PM, Amir Goldstein wrote:
> On Thu, Jan 15, 2026 at 3:35 PM Chunsheng Luo <luochunsheng@...c.edu> wrote:
>>
>>
>>
>> On 1/15/26 9:09 PM, Amir Goldstein wrote:
>>> Hi Chunsheng,
>>>
>>> Please CC me for future fuse passthrough patch sets.
>>>
>> Ok.
>>
>>> On Thu, Jan 15, 2026 at 03:20:31PM +0800, Chunsheng Luo wrote:
>>>> To simplify crash recovery and reduce performance impact, backing_ids
>>>> are not persisted across daemon restarts. However, this creates a
>>>> problem: when the daemon restarts and a process opens the same FUSE
>>>> file, a new backing_id may be allocated for the same backing file. If
>>>> the inode already has a cached backing file from before the restart,
>>>> subsequent open requests with the new backing_id will fail in
>>>> fuse_inode_uncached_io_start() due to fb mismatch, even though both
>>>> IDs reference the identical underlying file.
>>>
>>> I don't think that your proposal makes this guaranty.
>>>
>>
>> Yes, this proposal does not apply to all situations.
>>
>>>>
>>>> Introduce the FOPEN_PASSTHROUGH_INODE_CACHE flag to address this
>>>> issue. When set, the kernel reuses the backing file already cached in
>>>> the inode.
>>>>
>>>> Signed-off-by: Chunsheng Luo <luochunsheng@...c.edu>
>>>> ---
>>>> fs/fuse/iomode.c | 2 +-
>>>> fs/fuse/passthrough.c | 11 +++++++++++
>>>> include/uapi/linux/fuse.h | 2 ++
>>>> 3 files changed, 14 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/fuse/iomode.c b/fs/fuse/iomode.c
>>>> index 3728933188f3..b200bb248598 100644
>>>> --- a/fs/fuse/iomode.c
>>>> +++ b/fs/fuse/iomode.c
>>>> @@ -163,7 +163,7 @@ static void fuse_file_uncached_io_release(struct fuse_file *ff,
>>>> */
>>>> #define FOPEN_PASSTHROUGH_MASK \
>>>> (FOPEN_PASSTHROUGH | FOPEN_DIRECT_IO | FOPEN_PARALLEL_DIRECT_WRITES | \
>>>> - FOPEN_NOFLUSH)
>>>> + FOPEN_NOFLUSH | FOPEN_PASSTHROUGH_INODE_CACHE)
>>>>
>>>> static int fuse_file_passthrough_open(struct inode *inode, struct file *file)
>>>> {
>>>> diff --git a/fs/fuse/passthrough.c b/fs/fuse/passthrough.c
>>>> index 72de97c03d0e..fde4ac0c5737 100644
>>>> --- a/fs/fuse/passthrough.c
>>>> +++ b/fs/fuse/passthrough.c
>>>> @@ -147,16 +147,26 @@ ssize_t fuse_passthrough_mmap(struct file *file, struct vm_area_struct *vma)
>>>> /*
>>>> * Setup passthrough to a backing file.
>>>> *
>>>> + * If fuse inode backing is provided and FOPEN_PASSTHROUGH_INODE_CACHE flag
>>>> + * is set, try to reuse it first before looking up backing_id.
>>>> + *
>>>> * Returns an fb object with elevated refcount to be stored in fuse inode.
>>>> */
>>>> struct fuse_backing *fuse_passthrough_open(struct file *file, int backing_id)
>>>> {
>>>> struct fuse_file *ff = file->private_data;
>>>> struct fuse_conn *fc = ff->fm->fc;
>>>> + struct fuse_inode *fi = get_fuse_inode(file->f_inode);
>>>> struct fuse_backing *fb = NULL;
>>>> struct file *backing_file;
>>>> int err;
>>>>
>>>> + if (ff->open_flags & FOPEN_PASSTHROUGH_INODE_CACHE) {
>>>> + fb = fuse_backing_get(fuse_inode_backing(fi));
>>>> + if (fb)
>>>> + goto do_open;
>>>> + }
>>>> +
>>>
>>> Maybe an explicit FOPEN_PASSTHROUGH_INODE_CACHE flag is a good idea,
>>> but just FYI, I intentionally reserved backing_id 0 for this purpose.
>>> For example, for setting up the backing id on lookup [1] and then
>>> open does not need to specify the backing_id.
>>>
>>> [1] https://lore.kernel.org/linux-fsdevel/20250804173228.1990317-1-paullawrence@google.com/
>>>
>>
>> This is a great idea. However, we need to consider the lifecycle
>> management of the backing file associated with a FUSE inode.
>> Specifically, will the same backing_idbe retained for the entire
>> lifetime of the FUSE inode until it is deleted?
>
> It's not a good fit for servers that want to change the backing file
> (like re-download). For these servers we have the existing file
> open-to-close life cycle.
>
>>
>> Additionally, since each backing_idcorresponds to an open file
>> descriptor (fd) for the backing file, if a fuse_inode holds onto a
>> backing_id indefinitely without a suitable release mechanism, could this
>> accumulation of file descriptors cause the process to exceed its open
>> files limit?
>>
>
> There is no such accumulation.
> fuse_inode refers to a single fuse_backing object.
> fuse_file refers to a single fuse_backing object.
> It can be the same (refcounted) object.
>
Sorry, I wasn't referring to `fuse_backing` refs.
If the lifecycle of `fuse_backing` is the same as `fuse_inode`, and
there are a large number of FUSE files on the file system, then when I
iterate through and open the backing files, register the `fuse_backing`,
and then set it to the `fuse_inode`, the FUSE service will hold a large
number of backing file file descriptors (FDs). These backing file FDs
will only be released when the FUSE files are deleted.
For example, if there are 1000 FUSE files on the file system, and I
iterate through and set the backing file for each `fuse_inode`, then the
FUSE service will hold 1000 backing file FDs for a long time. Extending
this further, if there are even more files, could the FUSE service
process exceed the `ulimit` configuration for open files?
```shell
[root@...alhost home]# ulimit -a |grep "open files"
open files (-n) 1024
```
>>> But what you are proposing is a little bit odd API IMO:
>>> "Use this backing_id with this backing file, unless you find another
>>> backing file so use that one instead" - this sounds a bit awkward to me.
>>>
>>> I think it would be saner and simpler to relax the check in
>>> fuse_inode_uncached_io_start() to check that old and new fuse_backing
>>> objects refer to the same backing inode:
>>>
>>> diff --git a/fs/fuse/iomode.c b/fs/fuse/iomode.c
>>> index 3728933188f30..c6070c361d855 100644
>>> --- a/fs/fuse/iomode.c
>>> +++ b/fs/fuse/iomode.c
>>> @@ -88,9 +88,9 @@ int fuse_inode_uncached_io_start(struct fuse_inode *fi, struct fuse_backing *fb)
>>> int err = 0;
>>>
>>> spin_lock(&fi->lock);
>>> - /* deny conflicting backing files on same fuse inode */
>>> + /* deny conflicting backing inodes on same fuse inode */
>>> oldfb = fuse_inode_backing(fi);
>>> - if (fb && oldfb && oldfb != fb) {
>>> + if (fb && oldfb && file_inode(oldfb->file) != file_inode(fb->file)) {
>>> err = -EBUSY;
>>> goto unlock;
>>> }
>>> --
>>>
>>> I don't think that this requires opt-in flag.
>>>
>>> Thanks,
>>> Amir.
>>
>> I agree that modifying the condition to `file_inode(oldfb->file) !=
>> file_inode(fb->file)` is a reasonable fix, and it does address the first
>> scenario I described.
>>
>> However, it doesn't fully resolve the second scenario: in a read-only
>> FUSE filesystem, the backing file itself might be cleaned up and
>> re-downloaded (resulting in a new inode with identical content). In this
>> case, reusing the cached fuse_inode's fb after a daemon restart still be
>> safe, but the inode comparison would incorrectly reject it. Is there a
>> more robust approach for handling this scenario?
>>
>
> There is a reason we added the restriction against associating
> fuse file to different backing inodes.
>
> mmap and reads from different files to the same inode need to be
> cache coherent.
>
> IOW, we intentionally do not support this setup without server restart
> there is no reason for us to allow that after server restarts because
> the consequense will be the same.
>
> It does not sound like a good idea for the server to cleanup files
> that are currently opened via fuse passthrough - is that something
> that happens intentionally? after server restarts?
>
> You could try to take a write lease to check if the file is currently
> open for read/write to avoid cleanup in this case?
>
> Thanks,
> Amir.
>
>
Yes, it happened after the fuse service crash recovery restart, because
the refs of the backup files were cleaned up, causing them to be
mistakenly garbage collected.
I will consider how to prevent it from being mistakenly garbage
collected by the fuse server.
I will resend the patch, modifying only the condition in
`fuse_inode_uncached_io_start`, changing it to `file_inode(oldfb->file)
!= file_inode(fb->file)`.
Thanks.
Chunsheng Luo
Powered by blists - more mailing lists