[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGudoHF7jPw347kqcDW2mFLzcJcYqiFLsbFtd-ngYG=vWUz8xQ@mail.gmail.com>
Date: Wed, 10 Dec 2025 11:09:24 +0100
From: Mateusz Guzik <mjguzik@...il.com>
To: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc: Al Viro <viro@...iv.linux.org.uk>,
syzbot <syzbot+d222f4b7129379c3d5bc@...kaller.appspotmail.com>, brauner@...nel.org,
jack@...e.cz, jlbec@...lplan.org, joseph.qi@...ux.alibaba.com,
linkinjeon@...nel.org, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, mark@...heh.com, ocfs2-devel@...ts.linux.dev,
sj1557.seo@...sung.com, syzkaller-bugs@...glegroups.com,
Chuck Lever <chuck.lever@...cle.com>
Subject: Re: [PATCH for 6.19-rc1] fs: preserve file type in make_bad_inode()
unless invalid
On Wed, Dec 10, 2025 at 10:45 AM Tetsuo Handa
<penguin-kernel@...ove.sakura.ne.jp> wrote:
>
> syzbot is hitting VFS_BUG_ON_INODE(!S_ISDIR(inode->i_mode)) check
> introduced by commit e631df89cd5d ("fs: speed up path lookup with cheaper
> handling of MAY_EXEC"), for make_bad_inode() is blindly changing file type
> to S_IFREG. Since make_bad_inode() might be called after an inode is fully
> constructed, make_bad_inode() should not needlessly change file type.
>
ouch
So let's say calls to make_bad_inode *after* d_instantiate are unavoidable.
While screwing around with inode type for bogus inodes has merit,
switching things up from under dentry is bogus in its own right and
the choice of ISREG is questionable at best -- as is the call
introduces internally inconsistent state, in the case of the original
report a dentry claiming it's a directory pointing to a regular inode.
The non-screwed way of handling this would introduce a known BAD inode
type and patch up dentries as needed as well, but that might be a lot
of churn for not that much benefit.
As is, I don't know if leaving the type unchanged is safe -- there
might be nasty cases which depend on the change.
At the same time I claim the assert I introduced is mandatory.
Absent thorough audit, I think the pragmatic way forward for the time
being is to give code a chance to assert correctness and adjust the
assert in lookup code accordingly.
Right now that's not possible since the reassignment of type happens
without the inode spinlock held. Since make_bad_inode() calls into
remove_inode_hash which takes the spinlock, it should be safe to also
take it around the reassignment.
Then code wishing to assert on type race-free can take the spinlock to
stabilize it.
> Reported-by: syzbot+d222f4b7129379c3d5bc@...kaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
> Signed-off-by: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
> ---
> Should we implement all callbacks (except get_offset_ctx callback which is
> currently used by only tmpfs which does not call make_bad_inode()) within
> bad_inode_ops, for there might be a callback which is expected to be non-NULL
> for !S_IFREG types? Implementing missing callbacks is good for eliminating
> possibility of NULL function pointer call. Since VFS is using
>
> if (!inode->i_op->foo)
> return error;
> inode->i_op->foo();
>
> pattern instead of
>
> pFoo = READ_ONCE(inode->i_op->foo)
> if (!pFoo)
> return error;
> pFoo();
>
> pattern, suddenly replacing "one i_op with i_op->foo != NULL" with "another
> i_op with i_op->foo == NULL" has possibility of NULL pointer function call
> (e.g. https://lkml.kernel.org/r/18a58415-4aa9-4cba-97d2-b70384407313@I-love.SAKURA.ne.jp ).
> If we implement missing callbacks, e.g. vfs_fileattr_get() will start
> calling security_inode_file_getattr() on bad inode, but we can eliminate
> possibility of inode->i_op->fileattr_get == NULL when make_bad_inode() is
> called from security_inode_file_getattr() for some reason.
>
> fs/bad_inode.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/fs/bad_inode.c b/fs/bad_inode.c
> index 0ef9bcb744dd..ff6c2daecd1c 100644
> --- a/fs/bad_inode.c
> +++ b/fs/bad_inode.c
> @@ -207,7 +207,19 @@ void make_bad_inode(struct inode *inode)
> {
> remove_inode_hash(inode);
>
> - inode->i_mode = S_IFREG;
> + switch (inode->i_mode & S_IFMT) {
> + case S_IFREG:
> + case S_IFDIR:
> + case S_IFLNK:
> + case S_IFCHR:
> + case S_IFBLK:
> + case S_IFIFO:
> + case S_IFSOCK:
> + inode->i_mode &= S_IFMT;
> + break;
> + default:
> + inode->i_mode = S_IFREG;
> + }
> simple_inode_init_ts(inode);
> inode->i_op = &bad_inode_ops;
> inode->i_opflags &= ~IOP_XATTR;
> --
> 2.47.3
>
>
Powered by blists - more mailing lists