lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3d19dc03-72aa-46de-a6cc-4426cc84eb51@auristor.com>
Date: Tue, 29 Apr 2025 13:35:36 -0400
From: Jeffrey E Altman <jaltman@...istor.com>
To: David Howells <dhowells@...hat.com>,
 Alexander Viro <viro@...iv.linux.org.uk>,
 Christian Brauner <brauner@...nel.org>
Cc: Etienne Champetier <champetier.etienne@...il.com>,
 Marc Dionne <marc.dionne@...istor.com>, Chet Ramey <chet.ramey@...e.edu>,
 Steve French <sfrench@...ba.org>, linux-afs@...ts.infradead.org,
 openafs-devel@...nafs.org, linux-cifs@...r.kernel.org,
 linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] afs, bash: Fix open(O_CREAT) on an extant AFS file in a
 sticky dir

David,

Thanks for this patch.   I believe there is one mistake in 
afs_have_same_owner().

I've added a bit more background which might be incorporated into a 
future commit message.

I have also asked inline about the safety of the use of id-mapped uids 
for the ownership checks.

On 4/29/2025 12:37 PM, David Howells wrote:
>      
> Bash has a work around in redir_open() that causes open(O_CREAT) of a file
> in a sticky directory to be retried without O_CREAT if bash was built with
> AFS workarounds configured:
>
>          #if defined (AFS)
>                if ((fd < 0) && (errno == EACCES))
>              {
>                fd = open (filename, flags & ~O_CREAT, mode);
>                errno = EACCES;    /* restore errno */
>              }
>
>          #endif /* AFS */

I think its worth clarifying the purpose of this fallback logic and why 
it exists.  The fallback
logic was added to bash 1.14.7 as part of the introduction of support 
for IBM/Transarc AFS 3.4.
It was noted that sometimes EEXIST would be returned from open(filename, 
flags | O_CREAT)
but would succeed if open(filename, flags & ~O_CREAT) was called.  There 
is no evidence that
the AFS developers were aware of the problem.

I can report that the cause of this behavior is the AFS fileserver's 
checking for PRSFS_INSERT
rights on the directory prior to performing the CreateFile action.  If 
the caller is not permitted
to create a directory entry the action fails with EACCES even if the 
requested filename already
exists.  The most recent versions of IBM AFS 3.6, OpenAFS, and 
AuriStorFS fileservers all
continue to exhibit this behavior today.

This logic predated 30aba6656f61ed44cba445a3c0d38b296fa9e8f5 ("namei: 
allow restricted
O_CREAT of FIFOs and regular files") by decades.  As a side effect, when 
the filesystem is an
in-tree or out-of-tree AFS-family filesystem ...

> This works around the kernel not being able to validly check the
> current_fsuid() against i_uid on the file or the directory because the
> uidspaces of the system and of AFS may well be disjoint.  The problem lies
> with the uid checks in may_create_in_sticky().
>
> However, the bash work around is going to be removed:
>
>          https://git.savannah.gnu.org/cgit/bash.git/tree/redir.c?h=bash-5.3-rc1#n733
>
> Fix this in the kernel by:
>
>   (1) Provide an ->is_owned_by_me() inode op, similar to ->permission(),
>       and, if provided, call that to determine if the caller owns the file
>       instead of checking the i_uid to current_fsuid().
>
>   (2) Provide a ->have_same_owner() inode op, that, if provided, can be
>       called to see if an inode has the same owner as the parent on the path
>       walked.
>
> For kafs, use the first hook to check to see if the server indicated the
> ADMINISTER bit in the access rights returned by the FS.FetchStatus and
> suchlike

The AFSFetchStatus.CallerAccess field returned for a non-directory 
object includes the
PRSFS_ADMINISTER bit set if the caller is authenticated and the 
authenticated identity
is the AFS ID returned in the AFSFetchStatus.Owner field.

> and the second hook to compare the AFS user IDs retrieved by
> FS.FetchStatus (which may not fit in a kuid if AuriStor's YFS variant).

Perhaps more importantly, the AFS IDs specified in the 
AFSFetchStatus.Owner field often
have no relationship to the local system's uid namespace and might 
collide with various
uid ranges which might be used for id-mapping by container managers.

> This can be tested by creating a sticky directory (the user must have a
> token to do this) and creating a file in it.  Then strace bash doing "echo
> foo >>file" and look at whether bash does a single, successful O_CREAT open
> on the file or whether that one fails and then bash does one without
> O_CREAT that succeeds.
>
> Reported-by: Etienne Champetier <champetier.etienne@...il.com>
> Signed-off-by: David Howells <dhowells@...hat.com>
> cc: Marc Dionne <marc.dionne@...istor.com>
> cc: Jeffrey Altman <jaltman@...istor.com>
> cc: Chet Ramey <chet.ramey@...e.edu>
> cc: Alexander Viro <viro@...iv.linux.org.uk>
> cc: Christian Brauner <brauner@...nel.org>
> cc: Steve French <sfrench@...ba.org>
> cc: linux-afs@...ts.infradead.org
> cc: openafs-devel@...nafs.org
> cc: linux-cifs@...r.kernel.org
> cc: linux-fsdevel@...r.kernel.org
> ---
>   fs/afs/dir.c       |    2 ++
>   fs/afs/file.c      |    2 ++
>   fs/afs/internal.h  |    3 +++
>   fs/afs/security.c  |   52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>   fs/namei.c         |   22 ++++++++++++++++++----
>   include/linux/fs.h |    3 +++
>   6 files changed, 80 insertions(+), 4 deletions(-)
>
> diff --git a/fs/afs/dir.c b/fs/afs/dir.c
> index 9e7b1fe82c27..6360db1673b0 100644
> --- a/fs/afs/dir.c
> +++ b/fs/afs/dir.c
> @@ -65,6 +65,8 @@ const struct inode_operations afs_dir_inode_operations = {
>   	.permission	= afs_permission,
>   	.getattr	= afs_getattr,
>   	.setattr	= afs_setattr,
> +	.is_owned_by_me	= afs_is_owned_by_me,
> +	.have_same_owner = afs_have_same_owner,
>   };
>   
>   const struct address_space_operations afs_dir_aops = {
> diff --git a/fs/afs/file.c b/fs/afs/file.c
> index fc15497608c6..0317f0a36cf2 100644
> --- a/fs/afs/file.c
> +++ b/fs/afs/file.c
> @@ -47,6 +47,8 @@ const struct inode_operations afs_file_inode_operations = {
>   	.getattr	= afs_getattr,
>   	.setattr	= afs_setattr,
>   	.permission	= afs_permission,
> +	.is_owned_by_me	= afs_is_owned_by_me,
> +	.have_same_owner = afs_have_same_owner,
>   };
>   
>   const struct address_space_operations afs_file_aops = {
> diff --git a/fs/afs/internal.h b/fs/afs/internal.h
> index 440b0e731093..fbfbf615abe3 100644
> --- a/fs/afs/internal.h
> +++ b/fs/afs/internal.h
> @@ -1495,6 +1495,9 @@ extern struct key *afs_request_key(struct afs_cell *);
>   extern struct key *afs_request_key_rcu(struct afs_cell *);
>   extern int afs_check_permit(struct afs_vnode *, struct key *, afs_access_t *);
>   extern int afs_permission(struct mnt_idmap *, struct inode *, int);
> +int afs_is_owned_by_me(struct mnt_idmap *idmap, struct inode *inode);
> +int afs_have_same_owner(struct mnt_idmap *idmap, struct inode *inode,
> +			struct dentry *dentry);
>   extern void __exit afs_clean_up_permit_cache(void);
>   
>   /*
> diff --git a/fs/afs/security.c b/fs/afs/security.c
> index 6a7744c9e2a2..cc9689fbed47 100644
> --- a/fs/afs/security.c
> +++ b/fs/afs/security.c
> @@ -477,6 +477,58 @@ int afs_permission(struct mnt_idmap *idmap, struct inode *inode,
>   	return ret;
>   }
>   
> +/*
> + * Determine if an inode is owned by 'me' - whatever that means for the
> + * filesystem.  In the case of AFS, this means that the file is owned by the
> + * AFS user represented by the token (e.g. from a kerberos server) held in a
> + * key.  Returns 0 if owned by me, 1 if not; can also return an error.
> + */
> +int afs_is_owned_by_me(struct mnt_idmap *idmap, struct inode *inode)
> +{
> +	struct afs_vnode *vnode = AFS_FS_I(inode);
> +	afs_access_t access;
> +	struct key *key;
> +	int ret;
> +
> +	key = afs_request_key(vnode->volume->cell);
> +	if (IS_ERR(key))
> +		return PTR_ERR(key);
> +
> +	/* Get the access rights for the key on this file. */
> +	ret = afs_check_permit(vnode, key, &access);
> +	if (ret < 0)
> +		goto error;
> +
> +	/* We get the ADMINISTER bit if we own the file. */
> +	ret = (access & AFS_ACE_ADMINISTER) ? 0 : 1;
> +error:
> +	key_put(key);
> +	return ret;
> +}
> +
> +/*
> + * Determine if a file has the same owner as its parent - whatever that means
> + * for the filesystem.  In the case of AFS, this means comparing their AFS
> + * UIDs.  Returns 0 if same, 1 if not same; can also return an error.
> + */
> +int afs_have_same_owner(struct mnt_idmap *idmap, struct inode *inode,
> +			struct dentry *dentry)
> +{
> +	struct afs_vnode *vnode = AFS_FS_I(inode);
> +	struct dentry *parent;
> +	s64 owner;
> +
> +	/* Get the owner's ID for the directory.  Ideally, we'd use RCU to
> +	 * access the parent rather than getting a ref.
> +	 */
> +	parent = dget_parent(dentry);
> +	vnode = AFS_FS_I(d_backing_inode(parent));
This line is overwriting 'vnode' with the parent.   I think there needs 
to be a separate pvnode or something.
> +	owner = vnode->status.owner;
> +	dput(parent);
> +
> +	return vnode->status.owner != owner;
> +}
> +
>   void __exit afs_clean_up_permit_cache(void)
>   {
>   	int i;
> diff --git a/fs/namei.c b/fs/namei.c
> index 84a0e0b0111c..79e8ef1b6900 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -1318,11 +1318,25 @@ static int may_create_in_sticky(struct mnt_idmap *idmap, struct nameidata *nd,
>   
>   	i_vfsuid = i_uid_into_vfsuid(idmap, inode);

Unrelated to this change but is use of id-mapped uid values safe for 
this purpose?

Isn't it possible for more than one uid to be mapped to the same vfsuid 
value?=

> -	if (vfsuid_eq(i_vfsuid, dir_vfsuid))
> -		return 0;
> +	if (unlikely(inode->i_op->have_same_owner)) {
> +		int ret = inode->i_op->have_same_owner(idmap, inode, nd->path.dentry);
>   
> -	if (vfsuid_eq_kuid(i_vfsuid, current_fsuid()))
> -		return 0;
> +		if (ret <= 0)
> +			return ret;
> +	} else {
> +		if (vfsuid_eq(i_vfsuid, dir_vfsuid))
> +			return 0;
> +	}
> +
> +	if (unlikely(inode->i_op->is_owned_by_me)) {
> +		int ret = inode->i_op->is_owned_by_me(idmap, inode);
> +
> +		if (ret <= 0)
> +			return ret;
> +	} else {
> +		if (vfsuid_eq_kuid(i_vfsuid, current_fsuid()))
> +			return 0;
> +	}
>   
>   	if (likely(dir_mode & 0002)) {
>   		audit_log_path_denied(AUDIT_ANOM_CREAT, "sticky_create");
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 016b0fe1536e..ec278d2d362a 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2236,6 +2236,9 @@ struct inode_operations {
>   			    struct dentry *dentry, struct fileattr *fa);
>   	int (*fileattr_get)(struct dentry *dentry, struct fileattr *fa);
>   	struct offset_ctx *(*get_offset_ctx)(struct inode *inode);
> +	int (*is_owned_by_me)(struct mnt_idmap *idmap, struct inode *inode);
> +	int (*have_same_owner)(struct mnt_idmap *idmap, struct inode *inode,
> +			       struct dentry *dentry);
>   } ____cacheline_aligned;
>   
>   static inline int call_mmap(struct file *file, struct vm_area_struct *vma)


Jeffrey Altman



Download attachment "smime.p7s" of type "application/pkcs7-signature" (4276 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ