[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAO8a2ShMmC6kRTiVmRG54xb7N8-cQe45ecJYMhOcECjPy9wvTg@mail.gmail.com>
Date: Tue, 5 Aug 2025 16:28:28 +0400
From: Alex Markuze <amarkuze@...hat.com>
To: Viacheslav Dubeyko <Slava.Dubeyko@....com>
Cc: "ceph-devel@...r.kernel.org" <ceph-devel@...r.kernel.org>, "idryomov@...il.com" <idryomov@...il.com>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 2/2] ceph: fix client race condition where r_parent
becomes stale before sending message
That's the BUG, we are fixing here. If a mismatch happens the update
if not fixed can cause a mess. A full WARN may be an overkill I agree.
On Tue, Aug 5, 2025 at 1:58 AM Viacheslav Dubeyko <Slava.Dubeyko@....com> wrote:
>
> On Mon, 2025-08-04 at 09:59 +0000, Alex Markuze wrote:
> > When the parent directory's i_rwsem is not locked, req->r_parent may become
> > stale due to concurrent operations (e.g. rename) between dentry lookup and
> > message creation. Validate that r_parent matches the encoded parent inode
> > and update to the correct inode if a mismatch is detected.
> > ---
> > fs/ceph/inode.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++--
> > 1 file changed, 50 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
> > index 814f9e9656a0..7da648b5e901 100644
> > --- a/fs/ceph/inode.c
> > +++ b/fs/ceph/inode.c
> > @@ -56,6 +56,51 @@ static int ceph_set_ino_cb(struct inode *inode, void *data)
> > return 0;
> > }
> >
> > +/*
> > + * Check if the parent inode matches the vino from directory reply info
> > + */
> > +static inline bool ceph_vino_matches_parent(struct inode *parent, struct ceph_vino vino)
> > +{
> > + return ceph_ino(parent) == vino.ino && ceph_snap(parent) == vino.snap;
> > +}
> > +
> > +/*
> > + * Validate that the directory inode referenced by @req->r_parent matches the
> > + * inode number and snapshot id contained in the reply's directory record. If
> > + * they do not match – which can theoretically happen if the parent dentry was
> > + * moved between the time the request was issued and the reply arrived – fall
> > + * back to looking up the correct inode in the inode cache.
> > + *
> > + * A reference is *always* returned. Callers that receive a different inode
> > + * than the original @parent are responsible for dropping the extra reference
> > + * once the reply has been processed.
> > + */
> > +static struct inode *ceph_get_reply_dir(struct super_block *sb,
> > + struct inode *parent,
> > + struct ceph_mds_reply_info_parsed *rinfo)
> > +{
> > + struct ceph_vino vino;
> > +
> > + if (unlikely(!rinfo->diri.in))
> > + return parent; /* nothing to compare against */
> > +
> > + /* If we didn't have a cached parent inode to begin with, just bail out. */
> > + if (!parent)
> > + return NULL;
> > +
> > + vino.ino = le64_to_cpu(rinfo->diri.in->ino);
> > + vino.snap = le64_to_cpu(rinfo->diri.in->snapid);
> > +
> > + if (likely(ceph_vino_matches_parent(parent, vino)))
> > + return parent; /* matches – use the original reference */
> > +
> > + /* Mismatch – this should be rare. Emit a WARN and obtain the correct inode. */
> > + WARN(1, "ceph: reply dir mismatch (parent valid %llx.%llx reply %llx.%llx)\n",
> > + ceph_ino(parent), ceph_snap(parent), vino.ino, vino.snap);
>
> I am not completely sure that I follow why we would like to use namely WARN()
> here? If we have some condition, then WARN() looks like natural choice.
> Otherwise, if we have unconditional situation, then, maybe, pr_warn() will be
> better? Would we like to show call trace here?
>
> Are we really sure that this mismatch could be the rare case? Otherwise, call
> traces from multiple threads will create the real mess in the system log.
>
> Thanks,
> Slava.
>
> > +
> > + return ceph_get_inode(sb, vino, NULL);
> > +}
> > +
> > /**
> > * ceph_new_inode - allocate a new inode in advance of an expected create
> > * @dir: parent directory for new inode
> > @@ -1548,8 +1593,11 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req)
> > }
> >
> > if (rinfo->head->is_dentry) {
> > - struct inode *dir = req->r_parent;
> > -
> > + /*
> > + * r_parent may be stale, in cases when R_PARENT_LOCKED is not set,
> > + * so we need to get the correct inode
> > + */
> > + struct inode *dir = ceph_get_reply_dir(sb, req->r_parent, rinfo);
> > if (dir) {
> > err = ceph_fill_inode(dir, NULL, &rinfo->diri,
> > rinfo->dirfrag, session, -1,
>
> --
> Viacheslav Dubeyko <Slava.Dubeyko@....com>
Powered by blists - more mailing lists