Message-ID: <rrgn345nemz5xeatbrsggnybqech74ogub47d6au45mrmgch4d@jqzorhulkvre>
Date: Wed, 27 Aug 2025 14:58:51 +0200
From: Mateusz Guzik <mjguzik@...il.com>
To: Josef Bacik <josef@...icpanda.com>
Cc: linux-fsdevel@...r.kernel.org, linux-btrfs@...r.kernel.org, 
	kernel-team@...com, linux-ext4@...r.kernel.org, linux-xfs@...r.kernel.org, 
	brauner@...nel.org, viro@...iv.linux.org.uk, amir73il@...il.com
Subject: Re: [PATCH v2 03/54] fs: rework iput logic

On Tue, Aug 26, 2025 at 11:39:03AM -0400, Josef Bacik wrote:
> Currently, if we are the last iput, and we have the I_DIRTY_TIME bit
> set, we will grab a reference on the inode again and then mark it dirty
> and then redo the put.  This is to make sure we delay the time update
> for as long as possible.
> 
> We can rework this logic to simply dec i_count if it is not 1, and if it
> is do the time update while still holding the i_count reference.
> 
> Then we can replace the atomic_dec_and_lock with locking the ->i_lock
> and doing atomic_dec_and_test, since we did the atomic_add_unless above.
> 
> Signed-off-by: Josef Bacik <josef@...icpanda.com>
> ---
>  fs/inode.c | 23 ++++++++++++++---------
>  1 file changed, 14 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/inode.c b/fs/inode.c
> index a3673e1ed157..13e80b434323 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -1911,16 +1911,21 @@ void iput(struct inode *inode)
>  	if (!inode)
>  		return;
>  	BUG_ON(inode->i_state & I_CLEAR);
> -retry:
> -	if (atomic_dec_and_lock(&inode->i_count, &inode->i_lock)) {
> -		if (inode->i_nlink && (inode->i_state & I_DIRTY_TIME)) {
> -			atomic_inc(&inode->i_count);
> -			spin_unlock(&inode->i_lock);
> -			trace_writeback_lazytime_iput(inode);
> -			mark_inode_dirty_sync(inode);
> -			goto retry;
> -		}
> +
> +	if (atomic_add_unless(&inode->i_count, -1, 1))
> +		return;
> +
> +	if (inode->i_nlink && (inode->i_state & I_DIRTY_TIME)) {
> +		trace_writeback_lazytime_iput(inode);
> +		mark_inode_dirty_sync(inode);
> +	}
> +
> +	spin_lock(&inode->i_lock);
> +	if (atomic_dec_and_test(&inode->i_count)) {
> +		/* iput_final() drops i_lock */
>  		iput_final(inode);
> +	} else {
> +		spin_unlock(&inode->i_lock);
>  	}
>  }
>  EXPORT_SYMBOL(iput);
> -- 
> 2.49.0
> 

This changes semantics though.

In the stock kernel the I_DIRTY_TIME business is guaranteed to be sorted
out before the call to iput_final().

In principle the flag may reappear after mark_inode_dirty_sync() returns
and before the retried atomic_dec_and_lock succeeds, in which case it
will get cleared again.

With your change the flag is only handled once and should it reappear
before you take the ->i_lock, it will stay there.

I agree the stock handling is pretty crap though.

Your change should test the flag again after taking the spin lock but
before messing with the refcount and if need be unlock + retry.

It would not hurt to assert in iput_final() that the spin lock is held
and that this flag is not set.

Here is my diff to your diff to illustrate + a cosmetic change, not even
compile-tested:

diff --git a/fs/inode.c b/fs/inode.c
index 421e248b690f..a9ae0c790b5d 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1911,7 +1911,7 @@ void iput(struct inode *inode)
 	if (!inode)
 		return;
 	BUG_ON(inode->i_state & I_CLEAR);
-
+retry:
 	if (atomic_add_unless(&inode->i_count, -1, 1))
 		return;
 
@@ -1921,12 +1921,19 @@ void iput(struct inode *inode)
 	}
 
 	spin_lock(&inode->i_lock);
+
+	if (atomic_read(&inode->i_count) == 1 && inode->i_nlink && (inode->i_state & I_DIRTY_TIME)) {
+		spin_unlock(&inode->i_lock);
+		goto retry;
+	}
+
-	if (atomic_dec_and_test(&inode->i_count)) {
+	if (!atomic_dec_and_test(&inode->i_count)) {
-		/* iput_final() drops i_lock */
-		iput_final(inode);
-	} else {
 		spin_unlock(&inode->i_lock);
+		return;
 	}
+
+	/* iput_final() drops i_lock */
+	iput_final(inode);
 }
 EXPORT_SYMBOL(iput);
 
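For illustration only, the intended flow (recheck the flag under the lock, retry if it came back, assert in the final step) can be modelled in userspace. This is a rough sketch under stated assumptions, not kernel code: every toy_* name is made up for this example, an atomic_flag busy-wait stands in for ->i_lock, toy_mark_dirty_sync() only clears the flag, and the before_lock hook exists purely to inject the "flag reappears" window deterministically.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_DIRTY_TIME 0x1u	/* stand-in for I_DIRTY_TIME */

struct toy_inode {
	atomic_int count;	/* stand-in for i_count */
	unsigned int nlink;	/* stand-in for i_nlink */
	atomic_uint state;	/* stand-in for i_state */
	atomic_flag lock;	/* stand-in for i_lock */
	int dirty_syncs;	/* how often the flag was flushed */
	int finals;		/* how often toy_iput_final() ran */
	/* hypothetical hook, only here to inject the race */
	void (*before_lock)(struct toy_inode *);
};

void toy_lock(struct toy_inode *inode)
{
	while (atomic_flag_test_and_set(&inode->lock))
		;	/* spin */
}

void toy_unlock(struct toy_inode *inode)
{
	atomic_flag_clear(&inode->lock);
}

/* Add a to *v unless it equals u; returns true if the add happened. */
bool toy_add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	do {
		if (old == u)
			return false;
	} while (!atomic_compare_exchange_weak(v, &old, old + a));
	return true;
}

/* Simplified: the model only clears the flag and counts the call. */
void toy_mark_dirty_sync(struct toy_inode *inode)
{
	atomic_fetch_and(&inode->state, ~TOY_DIRTY_TIME);
	inode->dirty_syncs++;
}

/* Called with the lock held, drops it. */
void toy_iput_final(struct toy_inode *inode)
{
	/* the assertion suggested in the review, restricted to nlink != 0 */
	assert(!(inode->nlink &&
		 (atomic_load(&inode->state) & TOY_DIRTY_TIME)));
	inode->finals++;
	toy_unlock(inode);
}

void toy_iput(struct toy_inode *inode)
{
	if (!inode)
		return;
retry:
	/* not the last reference: just drop it */
	if (toy_add_unless(&inode->count, -1, 1))
		return;

	if (inode->nlink && (atomic_load(&inode->state) & TOY_DIRTY_TIME))
		toy_mark_dirty_sync(inode);

	if (inode->before_lock)
		inode->before_lock(inode);

	toy_lock(inode);

	/* the flag may have reappeared before the lock was taken */
	if (atomic_load(&inode->count) == 1 && inode->nlink &&
	    (atomic_load(&inode->state) & TOY_DIRTY_TIME)) {
		toy_unlock(inode);
		goto retry;
	}

	/* equivalent of !atomic_dec_and_test(): somebody else holds a ref */
	if (atomic_fetch_sub(&inode->count, 1) != 1) {
		toy_unlock(inode);
		return;
	}

	toy_iput_final(inode);
}

/* Race injector: sets the flag again exactly once. */
void toy_redirty_once(struct toy_inode *inode)
{
	inode->before_lock = NULL;
	atomic_fetch_or(&inode->state, TOY_DIRTY_TIME);
}
```

With toy_redirty_once installed as the hook, the last toy_iput() goes around the retry loop once more and the flag is gone by the time toy_iput_final() runs. Without the recheck the assertion in toy_iput_final() would trip.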
