lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0d4ffc45c292005a65ca244b013a313f7d35e607.camel@ibm.com>
Date: Mon, 4 Aug 2025 21:58:33 +0000
From: Viacheslav Dubeyko <Slava.Dubeyko@....com>
To: Alex Markuze <amarkuze@...hat.com>,
        "ceph-devel@...r.kernel.org"
	<ceph-devel@...r.kernel.org>
CC: "idryomov@...il.com" <idryomov@...il.com>,
        "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>
Subject: Re:  [PATCH v2 2/2] ceph: fix client race condition where r_parent
 becomes stale before sending message

On Mon, 2025-08-04 at 09:59 +0000, Alex Markuze wrote:
> When the parent directory's i_rwsem is not locked, req->r_parent may become
> stale due to concurrent operations (e.g. rename) between dentry lookup and
> message creation. Validate that r_parent matches the encoded parent inode
> and update to the correct inode if a mismatch is detected.
> ---
>  fs/ceph/inode.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 50 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> index 814f9e9656a0..7da648b5e901 100644
> --- a/fs/ceph/inode.c
> +++ b/fs/ceph/inode.c
> @@ -56,6 +56,51 @@ static int ceph_set_ino_cb(struct inode *inode, void *data)
>  	return 0;
>  }
>  
> +/*
> + * Check if the parent inode matches the vino from directory reply info
> + */
> +static inline bool ceph_vino_matches_parent(struct inode *parent, struct ceph_vino vino)
> +{
> +	return ceph_ino(parent) == vino.ino && ceph_snap(parent) == vino.snap;
> +}
> +
> +/*
> + * Validate that the directory inode referenced by @req->r_parent matches the
> + * inode number and snapshot id contained in the reply's directory record.  If
> + * they do not match – which can theoretically happen if the parent dentry was
> + * moved between the time the request was issued and the reply arrived – fall
> + * back to looking up the correct inode in the inode cache.
> + *
> + * A reference is *always* returned.  Callers that receive a different inode
> + * than the original @parent are responsible for dropping the extra reference
> + * once the reply has been processed.
> + */
> +static struct inode *ceph_get_reply_dir(struct super_block *sb,
> +                                       struct inode *parent,
> +                                       struct ceph_mds_reply_info_parsed *rinfo)
> +{
> +    struct ceph_vino vino;
> +
> +    if (unlikely(!rinfo->diri.in))
> +        return parent; /* nothing to compare against */
> +
> +    /* If we didn't have a cached parent inode to begin with, just bail out. */
> +    if (!parent)
> +        return NULL;
> +
> +    vino.ino  = le64_to_cpu(rinfo->diri.in->ino);
> +    vino.snap = le64_to_cpu(rinfo->diri.in->snapid);
> +
> +    if (likely(ceph_vino_matches_parent(parent, vino)))
> +        return parent; /* matches – use the original reference */
> +
> +    /* Mismatch – this should be rare.  Emit a WARN and obtain the correct inode. */
> +    WARN(1, "ceph: reply dir mismatch (parent valid %llx.%llx reply %llx.%llx)\n",
> +         ceph_ino(parent), ceph_snap(parent), vino.ino, vino.snap);

I am not completely sure that I follow why we would like to use namely WARN()
here? If we have some condition, then WARN() looks like natural choice.
Otherwise, if we have unconditional situation, then, maybe, pr_warn() will be
better? Would we like to show call trace here?

Are we really sure that this mismatch could be the rare case? Otherwise, call
traces from multiple threads will create the real mess in the system log.

Thanks,
Slava.

> +
> +    return ceph_get_inode(sb, vino, NULL);
> +}
> +
>  /**
>   * ceph_new_inode - allocate a new inode in advance of an expected create
>   * @dir: parent directory for new inode
> @@ -1548,8 +1593,11 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req)
>  	}
>  
>  	if (rinfo->head->is_dentry) {
> -		struct inode *dir = req->r_parent;
> -
> +		/*
> +		 * r_parent may be stale, in cases when R_PARENT_LOCKED is not set,
> +		 * so we need to get the correct inode
> +		 */
> +		struct inode *dir = ceph_get_reply_dir(sb, req->r_parent, rinfo);
>  		if (dir) {
>  			err = ceph_fill_inode(dir, NULL, &rinfo->diri,
>  					      rinfo->dirfrag, session, -1,

-- 
Viacheslav Dubeyko <Slava.Dubeyko@....com>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ