lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <bpevrif4k5h2l4pscsnsj3flwmwdw6w5nge5n7ji2yshk5pz6z@tngefti6stld>
Date: Tue, 18 Feb 2025 12:57:28 +0100
From: Jan Kara <jack@...e.cz>
To: Luis Henriques <luis@...lia.com>
Cc: Bernd Schubert <bschubert@....com>, Miklos Szeredi <miklos@...redi.hu>, 
	Dave Chinner <david@...morbit.com>, Alexander Viro <viro@...iv.linux.org.uk>, 
	Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>, Matt Harvey <mharvey@...ptrading.com>, 
	"linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5 2/2] fuse: add new function to invalidate cache for
 all inodes

On Mon 17-02-25 11:47:09, Luis Henriques wrote:
> On Mon, Feb 17 2025, Bernd Schubert wrote:
> > On 2/17/25 11:07, Luis Henriques wrote:
> >> On Mon, Feb 17 2025, Bernd Schubert wrote:
> >> 
> >>> On 2/16/25 17:50, Luis Henriques wrote:
> >>>> Currently userspace is able to notify the kernel to invalidate the cache
> >>>> for an inode.  This means that, if all the inodes in a filesystem need to
> >>>> be invalidated, then userspace needs to iterate through all of them and do
> >>>> this kernel notification separately.
> >>>>
> >>>> This patch adds a new option that allows userspace to invalidate all the
> >>>> inodes with a single notification operation.  In addition to invalidate
> >>>> all the inodes, it also shrinks the sb dcache.
> >>>>
> >>>> Signed-off-by: Luis Henriques <luis@...lia.com>
> >>>> ---
> >>>>  fs/fuse/inode.c           | 33 +++++++++++++++++++++++++++++++++
> >>>>  include/uapi/linux/fuse.h |  3 +++
> >>>>  2 files changed, 36 insertions(+)
> >>>>
> >>>> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
> >>>> index e9db2cb8c150..01a4dc5677ae 100644
> >>>> --- a/fs/fuse/inode.c
> >>>> +++ b/fs/fuse/inode.c
> >>>> @@ -547,6 +547,36 @@ struct inode *fuse_ilookup(struct fuse_conn *fc, u64 nodeid,
> >>>>  	return NULL;
> >>>>  }
> >>>>  
> >>>> +static int fuse_reverse_inval_all(struct fuse_conn *fc)
> >>>> +{
> >>>> +	struct fuse_mount *fm;
> >>>> +	struct inode *inode;
> >>>> +
> >>>> +	inode = fuse_ilookup(fc, FUSE_ROOT_ID, &fm);
> >>>> +	if (!inode || !fm)
> >>>> +		return -ENOENT;
> >>>> +
> >>>> +	/* Remove all possible active references to cached inodes */
> >>>> +	shrink_dcache_sb(fm->sb);
> >>>> +
> >>>> +	/* Remove all unreferenced inodes from cache */
> >>>> +	invalidate_inodes(fm->sb);
> >>>> +
> >>>> +	return 0;
> >>>> +}
> >>>> +
> >>>> +/*
> >>>> + * Notify to invalidate inodes cache.  It can be called with @nodeid set to
> >>>> + * either:
> >>>> + *
> >>>> + * - An inode number - Any pending writebacks within the rage [@offset @len]
> >>>> + *   will be triggered and the inode will be validated.  To invalidate the whole
> >>>> + *   cache @offset has to be set to '0' and @len needs to be <= '0'; if @offset
> >>>> + *   is negative, only the inode attributes are invalidated.
> >>>> + *
> >>>> + * - FUSE_INVAL_ALL_INODES - All the inodes in the superblock are invalidated
> >>>> + *   and the whole dcache is shrinked.
> >>>> + */
> >>>>  int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid,
> >>>>  			     loff_t offset, loff_t len)
> >>>>  {
> >>>> @@ -555,6 +585,9 @@ int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid,
> >>>>  	pgoff_t pg_start;
> >>>>  	pgoff_t pg_end;
> >>>>  
> >>>> +	if (nodeid == FUSE_INVAL_ALL_INODES)
> >>>> +		return fuse_reverse_inval_all(fc);
> >>>> +
> >>>>  	inode = fuse_ilookup(fc, nodeid, NULL);
> >>>>  	if (!inode)
> >>>>  		return -ENOENT;
> >>>> diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
> >>>> index 5e0eb41d967e..e5852b63f99f 100644
> >>>> --- a/include/uapi/linux/fuse.h
> >>>> +++ b/include/uapi/linux/fuse.h
> >>>> @@ -669,6 +669,9 @@ enum fuse_notify_code {
> >>>>  	FUSE_NOTIFY_CODE_MAX,
> >>>>  };
> >>>>  
> >>>> +/* The nodeid to request to invalidate all inodes */
> >>>> +#define FUSE_INVAL_ALL_INODES 0
> >>>> +
> >>>>  /* The read buffer is required to be at least 8k, but may be much larger */
> >>>>  #define FUSE_MIN_READ_BUFFER 8192
> >>>>  
> >>>
> >>>
> >>> I think this version might end up in 
> >>>
> >>> static void fuse_evict_inode(struct inode *inode)
> >>> {
> >>> 	struct fuse_inode *fi = get_fuse_inode(inode);
> >>>
> >>> 	/* Will write inode on close/munmap and in all other dirtiers */
> >>> 	WARN_ON(inode->i_state & I_DIRTY_INODE);
> >>>
> >>>
> >>> if the fuse connection has writeback cache enabled.
> >>>
> >>>
> >>> Without having it tested, reproducer would probably be to run
> >>> something like passthrough_hp (without --direct-io), opening
> >>> and writing to a file and then sending FUSE_INVAL_ALL_INODES.
> >> 
> >> Thanks, Bernd.  So far I couldn't trigger this warning.  But I just found
> >> that there's a stupid bug in the code: a missing iput() after doing the
> >> fuse_ilookup().
> >> 
> >> I'll spend some more time trying to understand how (or if) the warning you
> >> mentioned can triggered before sending a new revision.
> >> 
> >
> > Maybe I'm wrong, but it calls 
> >
> >    invalidate_inodes()
> >       dispose_list()
> >         evict(inode)
> >            fuse_evict_inode()
> >
> > and if at the same time something writes to inode page cache, the
> > warning would be triggered? 
> > There are some conditions in evict, like inode_wait_for_writeback()
> > that might protect us, but what is if it waited and then just
> > in the right time the another write comes and dirties the inode
> > again?
> 
> Right, I have looked into that too but my understanding is that this can
> not happen because, before doing that wait, the code does:
> 
> 	inode_sb_list_del(inode);
> 
> and the inode state will include I_FREEING.
> 
> Thus, before writing to it again, the inode will need to get added back to
> the sb list.  Also, reading the comments on evict(), if something writes
> into the inode at that point that's likely a bug.  But this is just my
> understanding, and I may be missing something.

Yes. invalidate_inodes() checks i_count == 0 and sets I_FREEING. Once
I_FREEING is set nobody can acquire inode reference until the inode is
fully destroyed. So nobody should be writing to the inode or anything like
that.

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ