lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250204203059.GA909029@mit.edu>
Date: Tue, 4 Feb 2025 15:30:59 -0500
From: "Theodore Ts'o" <tytso@....edu>
To: Mateusz Guzik <mjguzik@...il.com>
Cc: Kees Cook <kees@...nel.org>,
        syzbot <syzbot+48a99e426f29859818c0@...kaller.appspotmail.com>,
        akpm@...ux-foundation.org, brauner@...nel.org, gustavoars@...nel.org,
        linux-hardening@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-mm@...ck.org, syzkaller-bugs@...glegroups.com
Subject: Re: [syzbot] [hardening?] [mm?] BUG: bad usercopy in vfs_readlink

On Tue, Feb 04, 2025 at 05:49:48PM +0100, Mateusz Guzik wrote:
> I'm going to restate: the original behavior can be restored by
> replacing i_size usage with a strlen call. However, as is I have no
> basis to think that the disparity between the two is legitimate. If an
> ext4 person (Ted cc'ed) tells me the disparity can legally happen and
> is the expected way, I'm going to patch it up no problem.

There are two kinds of symlinks in ext4.  "Fast symlinks", where the
symlink target can fit in i_block[].  And normal, sometimes called
"slow" symlinks, where i_block[] contains a pointer to data block
(either i_blocks[0] for a traditional inode, or the root of an extent
tree that will fit in i_block[] if EXTENTS_FL is set).  In the case
of a fast symlink, the maximum size of the file system is
sizeof(i_block[])-1.  For a normal/slow symlink, the maximum size is
super->s_blocksize.

We determine whether a symlink is "fast" or not by checking whether
i_size is less than sizeof(i_blocks[]).  In other words. if i_size
less than 60 (and it's not an encrypted inode), it's a fast symlink
and we set i_link to point to the i_block array.  From __ext4_iget()
in fs/ext4/inode.c:

		if (IS_ENCRYPTED(inode)) {
			inode->i_op = &ext4_encrypted_symlink_inode_operations;
		} else if (ext4_inode_is_fast_symlink(inode)) {
			inode->i_link = (char *)ei->i_data;
			inode->i_op = &ext4_fast_symlink_inode_operations;
			nd_terminate_link(ei->i_data, inode->i_size,
				sizeof(ei->i_data) - 1);
		} else {
			inode->i_op = &ext4_symlink_inode_operations;
		}


The call to nd_terminate_link() guarantees that inode->i_link is
NUL-terminated, although the on-disk formatshould have already been
NUL-terminated when the symlink is set.

Also note that there really is no point to caching inodes where
inode->i_link is not a NUL pointer, since in those cases we never call
file system's get_link() function; the VFS will just fetch it straight
from i_link.  And in those cases, it's because it's a "fast symlink"
(many file systems have this concept; not just ext[234]) and the
symlink target was read in as part of the on-disk inode, and so there
is no separate disk block read read the symlink.  And so if you are
attempting to cache symlinks where inode->i_link is non-NULL, you're
just wasting a small amount of memory, and wasting the CPU time to do
the strcpy.

It's also true that the vast majority of symlink targets are less than
60 bytes, which is why I was surprised that your symlink caching was
atually making a difference or many systems.  I guess if we have lots
of unicode file names with characters like ❤ and ❤️ and symlinks to
these files, you might have a bunch of slow symlinks.  But in the case
where you have symlinks like:

0 lrwxrwxrwx 1 root root 7 Dec 19  2020 /bin -> usr/bin/

Symlink caching won't do any good.

Cheers,

						- Ted

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ