lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250825-entbinden-kehle-2e1f8b67b190@brauner>
Date: Mon, 25 Aug 2025 11:07:55 +0200
From: Christian Brauner <brauner@...nel.org>
To: Josef Bacik <josef@...icpanda.com>
Cc: linux-fsdevel@...r.kernel.org, linux-btrfs@...r.kernel.org, 
	kernel-team@...com, linux-ext4@...r.kernel.org, linux-xfs@...r.kernel.org, 
	viro@...iv.linux.org.uk
Subject: Re: [PATCH 16/50] fs: change evict_inodes to use iput instead of
 evict directly

On Thu, Aug 21, 2025 at 04:18:27PM -0400, Josef Bacik wrote:
> At evict_inodes() time, we no longer have SB_ACTIVE set, so we can
> easily go through the normal iput path to clear any inodes. Update

I'm a bit lost why SB_ACTIVE is used here as a justification to call
iput(). I think it's because iput_final() would somehow add it back to
the LRU if SB_ACTIVE was still set and the filesystem somehow would
indicate it wouldn't want to drop the inode.

I'm confused where that would even happen. IOW, which filesystem would
indicate "don't drop the inode" even though it's about to vanish. But
anyway, that's probably not important because...

> dispose_list() to check how we need to free the inode, and then grab a
> full reference to the inode while we're looping through the remaining
> inodes, and simply iput them at the end.
> 
> Since we're just calling iput we don't really care about the i_count on
> the inode at the current time.  Remove the i_count checks and just call
> iput on every inode we find.
> 
> Signed-off-by: Josef Bacik <josef@...icpanda.com>
> ---
>  fs/inode.c | 26 +++++++++++---------------
>  1 file changed, 11 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/inode.c b/fs/inode.c
> index 72981b890ec6..80ad327746a7 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -933,7 +933,7 @@ static void evict(struct inode *inode)
>   * Dispose-list gets a local list with local inodes in it, so it doesn't
>   * need to worry about list corruption and SMP locks.
>   */
> -static void dispose_list(struct list_head *head)
> +static void dispose_list(struct list_head *head, bool for_lru)
>  {
>  	while (!list_empty(head)) {
>  		struct inode *inode;
> @@ -941,8 +941,12 @@ static void dispose_list(struct list_head *head)
>  		inode = list_first_entry(head, struct inode, i_lru);
>  		list_del_init(&inode->i_lru);
>  
> -		evict(inode);
> -		iobj_put(inode);
> +		if (for_lru) {
> +			evict(inode);
> +			iobj_put(inode);
> +		} else {
> +			iput(inode);
> +		}

... Afaict, if we end up in dispose_list() we came from one of two
locations:

(1) prune_icache_sb()
    In which case inode_lru_isolate() will have only returned inodes
    that prior to your changes would have inode->i_count zero.

(2) evict_inodes()
    Similar story, this only hits inodes with inode->i_count zero.

With your change you're adding an increment from zero for (2) via
__iget() so that you always end up with a full refcount, and that is
backing your changes to dispose_list() later.

I don't see the same done for (1) though and so your later call to
iput() drops the reference below zero? It's accidently benign because
iiuc atomic_dec_and_test() will simply tell you that reference count
didn't go to zero and so iput() will back off. But still this should be
fixed if I'm right.

The conversion to iput() is introducing a lot of subtlety in the middle
of the series. If I'm right then the iput() is a always a nop because in
all cases it was an increment from zero. But it isn't really a nop
because we still do stuff like call ->drop_inode() again. Maybe it's
fine because no filesystem would have issues with this but I wouldn't
count on it and also it feels rather unclean to do it this way.

So, under the assumption, that after the increment from zero you did, we
really only have a blatant zombie inode on our hands and we only need to
get rid of the i_count we took make that explicit and do:

	if (for_lru) {
		evict(inode);
		iobj_put(inode);
	} else {
		/* This inode was always incremented from zero.
		 * Get rid of that reference without doing anything else.
		 */
		WARN_ON_ONCE(!atomic_dec_and_test(&inode->i_count));
	}

Btw, for the iobj_put() above, I assume that we're not guaranteed that
i_obj_count == 1?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ