[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3187f942-dcf0-4b2f-a106-0eb5d5a33949@fastmail.fm>
Date: Fri, 6 Oct 2023 19:13:06 +0200
From: Bernd Schubert <bernd.schubert@...tmail.fm>
To: Krister Johansen <kjlx@...pleofstupid.com>,
Miklos Szeredi <miklos@...redi.hu>,
linux-fsdevel@...r.kernel.org
Cc: Miklos Szeredi <mszeredi@...hat.com>, linux-kernel@...r.kernel.org,
German Maglione <gmaglione@...hat.com>,
Greg Kurz <groug@...d.org>, Max Reitz <mreitz@...hat.com>
Subject: Re: [resend PATCH v2 2/2] fuse: ensure that submounts lookup their
parent
On 10/2/23 17:24, Krister Johansen wrote:
> The submount code uses the parent nodeid passed into the function in
> order to create the root dentry for the new submount. This nodeid does
> not get its remote reference count incremented by a lookup option.
>
> If the parent inode is evicted from its superblock, due to memory
> pressure for example, it can result in a forget opertation being sent to
> the server. Should this nodeid be forgotten while it is still in use in
> a submount, users of the submount get an error from the server on any
> subsequent access. In the author's case, this was an EBADF on all
> subsequent operations that needed to reference the root.
>
> Debugging the problem revealed that the dentry shrinker triggered a forget
> after killing the dentry with the last reference, despite the root
> dentry in another superblock still using the nodeid.
>
> As a result, a container that was also using this submount failed to
> access its filesystem because it had borrowed the reference instead of
> taking its own when setting up its superblock for the submount.
>
> This commit fixes the problem by having the new submount trigger a
> lookup for the parent as part of creating a new root dentry for the
> virtiofsd submount superblock. This allows each superblock to have its
> inodes removed by the shrinker when unreferenced, while keeping the
> nodeid reference count accurate and active with the server.
>
> Signed-off-by: Krister Johansen <kjlx@...pleofstupid.com>
> ---
> fs/fuse/dir.c | 10 +++++-----
> fs/fuse/fuse_i.h | 6 ++++++
> fs/fuse/inode.c | 43 +++++++++++++++++++++++++++++++++++++------
> 3 files changed, 48 insertions(+), 11 deletions(-)
>
> diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
> index 5e01946d7531..333730c74619 100644
> --- a/fs/fuse/dir.c
> +++ b/fs/fuse/dir.c
> @@ -183,11 +183,11 @@ static void fuse_lookup_init(struct fuse_conn *fc, struct fuse_args *args,
> args->out_args[0].value = outarg;
> }
>
> -static int fuse_dentry_revalidate_lookup(struct fuse_mount *fm,
> - struct dentry *entry,
> - struct inode *inode,
> - struct fuse_entry_out *outarg,
> - bool *lookedup)
> +int fuse_dentry_revalidate_lookup(struct fuse_mount *fm,
> + struct dentry *entry,
> + struct inode *inode,
> + struct fuse_entry_out *outarg,
> + bool *lookedup)
> {
> struct dentry *parent;
> struct fuse_forget_link *forget;
> diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
> index 405252bb51f2..a66fcf50a4cc 100644
> --- a/fs/fuse/fuse_i.h
> +++ b/fs/fuse/fuse_i.h
> @@ -1325,6 +1325,12 @@ void fuse_dax_dontcache(struct inode *inode, unsigned int flags);
> bool fuse_dax_check_alignment(struct fuse_conn *fc, unsigned int map_alignment);
> void fuse_dax_cancel_work(struct fuse_conn *fc);
>
> +/* dir.c */
> +int fuse_dentry_revalidate_lookup(struct fuse_mount *fm, struct dentry *entry,
> + struct inode *inode,
> + struct fuse_entry_out *outarg,
> + bool *lookedup);
> +
> /* ioctl.c */
> long fuse_file_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
> long fuse_file_compat_ioctl(struct file *file, unsigned int cmd,
> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> index 444418e240c8..79a31cb55512 100644
> --- a/fs/fuse/inode.c
> +++ b/fs/fuse/inode.c
> @@ -1464,7 +1464,13 @@ static int fuse_fill_super_submount(struct super_block *sb,
> struct fuse_mount *fm = get_fuse_mount_super(sb);
> struct super_block *parent_sb = parent_fi->inode.i_sb;
> struct fuse_attr root_attr;
> + struct fuse_inode *fi;
> struct inode *root;
> + struct inode *parent;
> + struct dentry *pdent;
> + struct fuse_entry_out outarg;
> + bool lookedup = false;
> + int ret;
>
> fuse_sb_defaults(sb);
> fm->sb = sb;
> @@ -1480,14 +1486,39 @@ static int fuse_fill_super_submount(struct super_block *sb,
> if (parent_sb->s_subtype && !sb->s_subtype)
> return -ENOMEM;
>
> - fuse_fill_attr_from_inode(&root_attr, parent_fi);
> - root = fuse_iget(sb, parent_fi->nodeid, 0, &root_attr, 0, 0);
> /*
> - * This inode is just a duplicate, so it is not looked up and
> - * its nlookup should not be incremented. fuse_iget() does
> - * that, though, so undo it here.
> + * It is necessary to lookup the parent_if->nodeid in case the dentry
> + * that triggered the automount of the submount is later evicted.
> + * If this dentry is evicted without the lookup count getting increased
> + * on the submount root, then the server can subsequently forget this
> + * nodeid which leads to errors when trying to access the root of the
> + * submount.
> */
> - get_fuse_inode(root)->nlookup--;
> + parent = &parent_fi->inode;
> + pdent = d_find_alias(parent);
> + if (!pdent)
> + return -EINVAL;
> +
> + ret = fuse_dentry_revalidate_lookup(fm, pdent, parent, &outarg,
> + &lookedup);
> + dput(pdent);
> + /*
> + * The new root owns this nlookup on success, and it is incremented by
> + * fuse_iget(). In the case the lookup succeeded but revalidate fails,
> + * ensure that the lookup count is tracked by the parent.
> + */
> + if (ret <= 0) {
> + if (lookedup) {
> + fi = get_fuse_inode(parent);
> + spin_lock(&fi->lock);
> + fi->nlookup++;
> + spin_unlock(&fi->lock);
> + }
I might be wrong, but doesn't that mean that
"get_fuse_inode(root)->nlookup--" needs to be called?
Thanks,
Bernd
Powered by blists - more mailing lists