[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGudoHGnE2=+EqnwjkSD48VvGpK8MVAvYXQeTxA1o=PfOqHQnA@mail.gmail.com>
Date: Sat, 19 Aug 2023 22:03:40 +0200
From: Mateusz Guzik <mjguzik@...il.com>
To: Christian Brauner <brauner@...nel.org>
Cc: Kees Cook <keescook@...omium.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
syzbot <syzbot+6ec38f7a8db3b3fb1002@...kaller.appspotmail.com>,
anton@...era.com, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-ntfs-dev@...ts.sourceforge.net,
syzkaller-bugs@...glegroups.com, Al Viro <viro@...iv.linux.org.uk>,
"Theodore Ts'o" <tytso@....edu>
Subject: Re: [syzbot] [ntfs?] WARNING in do_open_execat
On 8/19/23, Christian Brauner <brauner@...nel.org> wrote:
> On Fri, Aug 18, 2023 at 09:12:39PM +0200, Mateusz Guzik wrote:
>> On Fri, Aug 18, 2023 at 10:33:26AM -0700, Kees Cook wrote:
>> > This is a double-check I left in place, since it shouldn't have been
>> > reachable:
>> >
>> > /*
>> > * may_open() has already checked for this, so it should be
>> > * impossible to trip now. But we need to be extra cautious
>> > * and check again at the very end too.
>> > */
>> > err = -EACCES;
>> > if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
>> > path_noexec(&file->f_path)))
>> > goto exit;
>> >
>>
>> As I mentioned in my other e-mail, the check is racy -- an unlucky
>> enough remounting with noexec should trip over it, and probably a chmod
>> too.
>>
>> However, that's not what triggers the warn in this case.
>>
>> The ntfs image used here is intentionally corrupted and the inode at
>> hand has a mode of 777 (as in type not specified).
>>
>> Then the type check in may_open():
>> switch (inode->i_mode & S_IFMT) {
>>
>> fails to match anything.
>>
>> This debug printk:
>> diff --git a/fs/namei.c b/fs/namei.c
>> index e56ff39a79bc..05652e8a1069 100644
>> --- a/fs/namei.c
>> +++ b/fs/namei.c
>> @@ -3259,6 +3259,10 @@ static int may_open(struct mnt_idmap *idmap, const
>> struct path *path,
>> if ((acc_mode & MAY_EXEC) && path_noexec(path))
>> return -EACCES;
>> break;
>> + default:
>> + /* bogus mode! */
>> + printk(KERN_EMERG "got bogus mode inode!\n");
>> + return -EACCES;
>> }
>>
>> error = inode_permission(idmap, inode, MAY_OPEN | acc_mode);
>>
>> catches it.
>>
>> All that said, I think adding a WARN_ONCE here is prudent, but I
>> don't know if denying literally all opts is the way to go.
>>
>> Do other filesystems have provisions to prevent inodes like this from
>> getting here?
>
> Bugs reported against the VFS from ntfs/ntfs3 are to be treated with
> extreme caution. Frankly, if it isn't reproducible without a corrupted
> ntfs/ntfs3 image it is to be dismissed until further notice.
>
> In this case it simply seems that ntfs is failing at ensuring that its
> own inodes it reads from disk have a well-defined type.
>
> If ntfs fails to validate that its own inodes it puts into the icache
> are correctly initialized then the vfs doesn't need to try and taper
> over this.
>
> If ntfs fails at that, there's no guarantee that it doesn't also fail at
> setting the correct i_ops for that inode. At which point we can check
> the type in may_open() but we already used bogus i_ops the whole time on
> some other inodes.
>
> We're not here to make up for silly bugs like this. That WARN belongs
> into ntfs not the vfs.
>
Given the triggered WARN_ON it seemed to me this would be the
operating procedure, I am happy it is not ;)
Per your description and the one provided by Theodore I take it
filesystems must not ship botched inodes like this one.
While in this case this is a clear-cut bug in ntfs, I would argue the
entire ordeal exposes a deficiency in VFS -- it should have a
debug-only mechanism which catches cases like this early on. For
example there could be a mandatory function to call when the
filesystem claims it constructed the inode to assert a bunch on it --
it would not catch all possible problems, but would definitely catch
this one (and VFS would have to detect the call was not made).
Perhaps I should write a separate e-mail about this, but I'm surprised
there is no debug-only (as in not present in production kernels)
support for asserting the state. To give one example which makes me
itchy see inode destruction. There are few checks in clear_inode, but
past that there is almost nothing.
Similarly there are quite a few comments how the caller is required to
hold a given lock, which should have been converted to lockdep asserts
years ago.
I'm going to write something up later.
--
Mateusz Guzik <mjguzik gmail.com>
Powered by blists - more mailing lists