[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGudoHFvLiCnCkAQN82qhdqmy0P2=Ywtnwqkc3KOc3s3hUQX8g@mail.gmail.com>
Date: Tue, 4 Feb 2025 21:48:22 +0100
From: Mateusz Guzik <mjguzik@...il.com>
To: "Theodore Ts'o" <tytso@....edu>
Cc: Kees Cook <kees@...nel.org>,
syzbot <syzbot+48a99e426f29859818c0@...kaller.appspotmail.com>,
akpm@...ux-foundation.org, brauner@...nel.org, gustavoars@...nel.org,
linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [hardening?] [mm?] BUG: bad usercopy in vfs_readlink
On Tue, Feb 4, 2025 at 9:31 PM Theodore Ts'o <tytso@....edu> wrote:
>
> On Tue, Feb 04, 2025 at 05:49:48PM +0100, Mateusz Guzik wrote:
> > I'm going to restate: the original behavior can be restored by
> > replacing i_size usage with a strlen call. However, as is I have no
> > basis to think that the disparity between the two is legitimate. If an
> > ext4 person (Ted cc'ed) tells me the disparity can legally happen and
> > is the expected way, I'm going to patch it up no problem.
>
> There are two kinds of symlinks in ext4. "Fast symlinks", where the
> symlink target can fit in i_block[]. And normal, sometimes called
> "slow" symlinks, where i_block[] contains a pointer to data block
> (either i_blocks[0] for a traditional inode, or the root of an extent
> tree that will fit in i_block[] if EXTENTS_FL is set). In the case
> of a fast symlink, the maximum size of the file system is
> sizeof(i_block[])-1. For a normal/slow symlink, the maximum size is
> super->s_blocksize.
>
> We determine whether a symlink is "fast" or not by checking whether
> i_size is less than sizeof(i_blocks[]). In other words. if i_size
> less than 60 (and it's not an encrypted inode), it's a fast symlink
> and we set i_link to point to the i_block array. From __ext4_iget()
> in fs/ext4/inode.c:
>
> if (IS_ENCRYPTED(inode)) {
> inode->i_op = &ext4_encrypted_symlink_inode_operations;
> } else if (ext4_inode_is_fast_symlink(inode)) {
> inode->i_link = (char *)ei->i_data;
> inode->i_op = &ext4_fast_symlink_inode_operations;
> nd_terminate_link(ei->i_data, inode->i_size,
> sizeof(ei->i_data) - 1);
> } else {
> inode->i_op = &ext4_symlink_inode_operations;
> }
>
>
> The call to nd_terminate_link() guarantees that inode->i_link is
> NUL-terminated, although the on-disk formatshould have already been
> NUL-terminated when the symlink is set.
>
It is nul-terminated, but inode->i_size does not line up with it -- as
in the inode size pulled from the disk image is different than what
strlen would suggest.
My question is if that's legitimate, I'm guessing not. If not, then
ext4 should complain about it.
On stock kernel this happens to work because strlen finds the "right" size.
> Also note that there really is no point to caching inodes where
> inode->i_link is not a NUL pointer, since in those cases we never call
> file system's get_link() function; the VFS will just fetch it straight
> from i_link. And in those cases, it's because it's a "fast symlink"
> (many file systems have this concept; not just ext[234]) and the
> symlink target was read in as part of the on-disk inode, and so there
> is no separate disk block read read the symlink. And so if you are
> attempting to cache symlinks where inode->i_link is non-NULL, you're
> just wasting a small amount of memory, and wasting the CPU time to do
> the strcpy.
>
My code does not allocate any extra memory or do any extra copies.
The code prior to my change would grab i_link, strlen on it and pass
that to copy_to_user every single time readlink is issued.
My code stores the size in the inode (unioned with a field not used
for symlinks, so no again no increase in memory usage) so that the
strlen call can be avoided. Past that it's the same thing.
--
Mateusz Guzik <mjguzik gmail.com>
Powered by blists - more mailing lists