linux-kernel - Re: [PATCH v3 2/4] fs: hide ->i_state handling behind accessors

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aMJxydmz_azN7Kce@dread.disaster.area>
Date: Thu, 11 Sep 2025 16:52:57 +1000
From: Dave Chinner <david@...morbit.com>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: brauner@...nel.org, viro@...iv.linux.org.uk, jack@...e.cz,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	josef@...icpanda.com, kernel-team@...com, amir73il@...il.com,
	linux-btrfs@...r.kernel.org, linux-ext4@...r.kernel.org,
	linux-xfs@...r.kernel.org, ocfs2-devel@...ts.linux.dev
Subject: Re: [PATCH v3 2/4] fs: hide ->i_state handling behind accessors

On Thu, Sep 11, 2025 at 06:55:55AM +0200, Mateusz Guzik wrote:
> Signed-off-by: Mateusz Guzik <mjguzik@...il.com>

So why did you choose these specific wrapper functions?

Some commentary on why you choose this specific API would be very
useful here.

> ---
>  block/bdev.c                     |   4 +-
>  drivers/dax/super.c              |   2 +-
>  fs/buffer.c                      |   4 +-
>  fs/crypto/keyring.c              |   2 +-
>  fs/crypto/keysetup.c             |   2 +-
>  fs/dcache.c                      |   8 +-
>  fs/drop_caches.c                 |   2 +-
>  fs/fs-writeback.c                | 123 ++++++++++++++++---------------
>  fs/inode.c                       | 103 +++++++++++++-------------
>  fs/libfs.c                       |   6 +-
>  fs/namei.c                       |   8 +-
>  fs/notify/fsnotify.c             |   2 +-
>  fs/pipe.c                        |   2 +-
>  fs/quota/dquot.c                 |   2 +-
>  fs/sync.c                        |   2 +-
>  include/linux/backing-dev.h      |   5 +-
>  include/linux/fs.h               |  55 +++++++++++++-
>  include/linux/writeback.h        |   4 +-
>  include/trace/events/writeback.h |   8 +-
>  mm/backing-dev.c                 |   2 +-
>  security/landlock/fs.c           |   2 +-
>  21 files changed, 198 insertions(+), 150 deletions(-)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index b77ddd12dc06..77f04042ac67 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -67,7 +67,7 @@ static void bdev_write_inode(struct block_device *bdev)
>  	int ret;
>  
>  	spin_lock(&inode->i_lock);
> -	while (inode->i_state & I_DIRTY) {
> +	while (inode_state_read(inode) & I_DIRTY) {
>  		spin_unlock(&inode->i_lock);
>  		ret = write_inode_now(inode, true);
>  		if (ret)

This isn't an improvement.

It makes the code harder to read, and now I have to go look at the
implementation of a set of helper functions to determine if that's
the right helper to use for the context the code is operating in.

What would be an improvement is making all the state flags disappear
behind the same flag APIs as other high level objects that
filesystems interface with. e.g. folio flags use
folio_test.../folio_set.../folio_clear...

Looking wider, at least XFS, ext4 and btrfs use these same
set/test/clear flag APIs for feature and mount option flags. XFS
also uses them for oeprational state in mount, journal and per-ag
structures, etc. It's a pretty common pattern.

Using it for the inode state flags would lead to code like this:

	spin_lock(&inode->i_lock);
	while (inode_test_dirty(inode)) {
	.....

That's far cleaner and easier to understand and use than an API that
explicitly encodes the locking context of the specific access being
made in the helper names.

IOWs, the above I_DIRTY flag ends up with a set of wrappers that
look like:

bool inode_test_dirty_unlocked(struct inode *inode)
{
	return inode->i_state & I_DIRTY;
}

bool inode_test_dirty(struct inode *inode)
{
	lockdep_assert_held(&inode->i_lock);
	return inode_test_dirty_unlocked(inode);
}

void inode_set_dirty(struct inode *inode)
{
	lockdep_assert_held(&inode->i_lock);
	inode->i_state |= I_DIRTY;
}

void inode_clear_dirty(struct inode *inode)
{
	lockdep_assert_held(&inode->i_lock);
	inode->i_state &= ~I_DIRTY;
}

With this, almost no callers need to know about the I_DIRTY flag -
direct use of it is a red flag and/or an exceptional case.  It's
self documenting that it is an exceptional case, and it better have
a comment explaining why it is safe....

This also gives us the necessary lockdep checks to ensure the right
locks are held when modifications are being made.

And best of all, the wrappers can be generated by macros; they don't
need to be directly coded and maintained.

Yes, we have compound state checks, but like page-flags.h we can
manually implement those few special cases such as this one:

> @@ -1265,7 +1265,7 @@ void sync_bdevs(bool wait)
>  		struct block_device *bdev;
>  
>  		spin_lock(&inode->i_lock);
> -		if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW) ||
> +		if (inode_state_read(inode) & (I_FREEING|I_WILL_FREE|I_NEW) ||

-		if (inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW) ||
+		if (inode_test_new_or_freeing(inode)) ||

bool inode_test_new_or_freeing(struct inode *inode)
{
	lockdep_assert_held(&inode->i_lock);
	return inode->i_state & (I_FREEING | I_WILL_FREE | I_NEW);
}

Or if we want to avoid directly using flags in these wrappers,
we write them like this:

bool inode_test_new_or_freeing(struct inode *inode)
{
	return inode_test_freeing(inode) ||
		inode_test_will_free(inode) ||
		inode_test_new(inode);
}

Writing the compound wrappers this way then allows future
improvements such as changing the state flags to atomic bitops so
we can remove all the dependencies on holding inode->i_lock to
manipulate state flags safely.

Hence I think moving the state flags behind an API similar to folio
state flags makes the code easier to read and use correctly, whilst
also providing the checking that the correct locks are held at the
correct times. It also makes it  easier to further improve the
implementation in future because all the users of the API are
completely isolated from the implmentation....

-Dave.
-- 
Dave Chinner
david@...morbit.com