lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <871pw5zdh9.fsf@igalia.com>
Date: Mon, 10 Feb 2025 21:39:14 +0000
From: Luis Henriques <luis@...lia.com>
To: Bernd Schubert <bschubert@....com>
Cc: Joanne Koong <joannelkoong@...il.com>,  Miklos Szeredi
 <miklos@...redi.hu>,  linux-fsdevel@...r.kernel.org,
  linux-kernel@...r.kernel.org,  Matt Harvey <mharvey@...ptrading.com>
Subject: Re: [RFC PATCH v3] fuse: add new function to invalidate cache for
 all inodes

Hi Joanne

Bernd has already added a few comments.  I'm just adding a few more on
top.

On Mon, Feb 10 2025, Bernd Schubert wrote:

> On 2/10/25 20:33, Joanne Koong wrote:
>> On 2/10/25 6:33 AM, Luis Henriques wrote:
>>> Currently userspace is able to notify the kernel to invalidate the cache
>>> for an inode.  This means that, if all the inodes in a filesystem need to
>>> be invalidated, then userspace needs to iterate through all of them
>>> and do
>>> this kernel notification separately.
>>>
>>> This patch adds a new option that allows userspace to invalidate all the
>>> inodes with a single notification operation.  In addition to
>>> invalidate all
>>> the inodes, it also shrinks the sb dcache.
>>>
>>> Signed-off-by: Luis Henriques <luis@...lia.com>
>>> ---
>>> * Changes since v2
>>> Use the new helper from fuse_reverse_inval_inode(), as suggested by
>>> Bernd.
>>>
>>> Also updated patch description as per checkpatch.pl suggestion.
>>>
>>> * Changes since v1
>>> As suggested by Bernd, this patch v2 simply adds an helper function that
>>> will make it easier to replace most of it's code by a call to function
>>> super_iter_inodes() when Dave Chinner's patch[1] eventually gets merged.
>>>
>>> [1] https://lore.kernel.org/r/20241002014017.3801899-3-
>>> david@...morbit.com
>>>
>>>   fs/fuse/inode.c           | 67 +++++++++++++++++++++++++++++++++++----
>>>   include/uapi/linux/fuse.h |  3 ++
>>>   2 files changed, 63 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
>>> index e9db2cb8c150..45b9fbb54d42 100644
>>> --- a/fs/fuse/inode.c
>>> +++ b/fs/fuse/inode.c
>>> @@ -547,25 +547,78 @@ struct inode *fuse_ilookup(struct fuse_conn *fc,
>>> u64 nodeid,
>>>       return NULL;
>>>   }
>>>   +static void inval_single_inode(struct inode *inode, struct
>>> fuse_conn *fc)
>>> +{
>>> +    struct fuse_inode *fi;
>>> +
>>> +    fi = get_fuse_inode(inode);
>>> +    spin_lock(&fi->lock);
>>> +    fi->attr_version = atomic64_inc_return(&fc->attr_version);
>>> +    spin_unlock(&fi->lock);
>>> +    fuse_invalidate_attr(inode);
>>> +    forget_all_cached_acls(inode);
>>> +}
>>> +
>>> +static int fuse_reverse_inval_all(struct fuse_conn *fc)
>>> +{
>>> +    struct fuse_mount *fm;
>>> +    struct super_block *sb;
>>> +    struct inode *inode, *old_inode = NULL;
>>> +
>>> +    inode = fuse_ilookup(fc, FUSE_ROOT_ID, NULL);
>>> +    if (!inode)
>>> +        return -ENOENT;
>>> +
>>> +    fm = get_fuse_mount(inode);
>> 
>> I think if you pass in &fm as the 3rd arg to fuse_ilookup(), it'll pass
>> back the fuse mount and we won't need get_fuse_mount().

Yeah, good catch!  That makes sense, I didn't noticed that this third
argument could be used that way.  I'll make sure next rev simplifies this
code.

>>> +    iput(inode);
>>> +    if (!fm)
>>> +        return -ENOENT;
>>> +    sb = fm->sb;
>>> +
>>> +    spin_lock(&sb->s_inode_list_lock);
>>> +    list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
>>> +        spin_lock(&inode->i_lock);
>>> +        if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
>>> +            !atomic_read(&inode->i_count)) {
>> 
>> Will inode->i_count ever be 0? AFAIU, inode->i_count tracks the inode
>> refcount, so if this is 0, doesn't this mean it wouldn't be on the sb-
>>>s_inodes list?

My point in having this check is that, while iterating the inodes list,
the inode may be iput()'ed before we actually have the chance to lock it.
This is simply to skip the (unlikely) case when this happens.  Maybe it's
not really necessary, maybe the previous checks (the ->i_state) are
enough.  But I've seen this pattern in other places (for example, in
evict_inodes()).

>>> +            spin_unlock(&inode->i_lock);
>>> +            continue;
>>> +        }
>>> +
>>> +        __iget(inode);
>>> +        spin_unlock(&inode->i_lock);
>>> +        spin_unlock(&sb->s_inode_list_lock);
>> 
>> Maybe worth adding a comment here since there can be inodes added after
>> the s_inode_list_lock is dropped and before it's acquired again that
>> when inodes get added to the head of sb->s_inodes, it's always for I_NEW
>> inodes.
>> 
>>> +        iput(old_inode);
>> 
>> Maybe a dumb question but why is old_inode needed? Why can't iput()just
>> be called right after inval_single_inode()?
>
> I had wondered the same in v1. Issue is that there is a list iteration
> that releases the locks - if the put would be done immediately it could
> not continue on "old_inode" as it might not exist anymore.

Exactly, thanks Bernd.  We release the locks and do the cond_resched()
below for the cases where we have a huge amount of inodes to invalidate.

I'll prepare a new rev with some extra code comments (including those
suggested by Bernd before) and remove the call to get_fuse_mount().

Thank you both for your feedback.  Much appreciated!

Cheers,
-- 
Luís

>>> +
>>> +        inval_single_inode(inode, fc);
>>> +
>>> +        old_inode = inode;
>>> +        cond_resched();
>> 
>> Could you explain why a cond_resched() is needed here?
>
> Give other threads a chance to work? The list might be huge?
>
>
>
> Thanks,
> Bernd


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ