[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAN-5tyEjYRFrJ7Gc4S8KwAZUuF-uz6ovPa4-_ynt+GGVqJHN_A@mail.gmail.com>
Date: Fri, 19 Dec 2025 10:58:51 -0500
From: Olga Kornievskaia <aglo@...ch.edu>
To: Jeff Layton <jlayton@...nel.org>
Cc: Alexander Viro <viro@...iv.linux.org.uk>, Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Chuck Lever <chuck.lever@...cle.com>,
NeilBrown <neil@...wn.name>, Olga Kornievskaia <okorniev@...hat.com>, Dai Ngo <Dai.Ngo@...cle.com>,
Tom Talpey <tom@...pey.com>, Trond Myklebust <trondmy@...merspace.com>,
Anna Schumaker <anna@...nel.org>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
linux-nfs@...r.kernel.org
Subject: Re: [PATCH v4 8/8] nfsd: freeze c/mtime updates with outstanding
WRITE_ATTRS delegation
Hi Jeff,
I narrowed down the upstream failure for generic/215 and generic/407
to this commit.
Let's consider first where the kernel is compiled with delegated
attributes off (but it also fails just the same if the delegated
attributes are compiled in).
I don't understand why the code unconditionally changed to call
nfsd4_finalize_deleg_timestamps() which I think the main driver behind
the failure.
Running generic/407 there is an OPEN (which gives out a write
delegation) and returns a change id, then on this filehandle there is
a SETATTR (with a getattr) which returns a new changeid. Then there is
a CLONE where the filehandle is the destination filehandle on which
there is a getattr which returns unchanged changeid/modify time (bad).
Then there is a DELEGRETURN (with a getattr) which again returns same
change id. Test fails.
Prior to this commit. The changeid/modify time is different in CLONE
and DELEGRETURN -- test passes.
Now let me describe what happens with delegated attributes enabled.
OPEN returns delegated attributes delegation, included getattr return
a changeid. Then CLONE is done, the included gettattr returns a
different (from open's) changeid (different time_modify). Then there
is SETATTR+GEATTR+DELEGRETURN compound from the client (which carries
a time_deleg_modify value different from above). Server in getattr
replies with changeid same as in clone and mtime with the value client
provided. So I'm not sure exactly why the test fails here but that's a
different problem as my focus is on "delegation attribute off option"
at the moment.
I don't know if this is the correct fix or not but perhaps we
shouldn't unconditionally be setting this mode? (note this fix only
fixes the delegattributes off. however i have no claims that this
patch is what broke 215/407 for delegated attributes on. Something
else is in play there). If this solution is acceptable, I can send a
patch.
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 81fa7cc6c77b..624cc6ab2802 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -6318,7 +6318,8 @@ nfs4_open_delegation(struct svc_rqst *rqstp,
struct nfsd4_open *open,
dp->dl_ctime = stat.ctime;
dp->dl_mtime = stat.mtime;
spin_lock(&f->f_lock);
- f->f_mode |= FMODE_NOCMTIME;
+ if (deleg_ts)
+ f->f_mode |= FMODE_NOCMTIME;
spin_unlock(&f->f_lock);
trace_nfsd_deleg_write(&dp->dl_stid.sc_stateid);
} else {
On Wed, Jul 30, 2025 at 9:27 AM Jeff Layton <jlayton@...nel.org> wrote:
>
> Instead of allowing the ctime to roll backward with a WRITE_ATTRS
> delegation, set FMODE_NOCMTIME on the file and have it skip mtime and
> ctime updates.
>
> It is possible that the client will never send a SETATTR to set the
> times before returning the delegation. Add two new bools to struct
> nfs4_delegation:
>
> dl_written: tracks whether the file has been written since the
> delegation was granted. This is set in the WRITE and LAYOUTCOMMIT
> handlers.
>
> dl_setattr: tracks whether the client has sent at least one valid
> mtime that can also update the ctime in a SETATTR.
>
> When unlocking the lease for the delegation, clear FMODE_NOCMTIME. If
> the file has been written, but no setattr for the delegated mtime and
> ctime has been done, update the timestamps to current_time().
>
> Suggested-by: NeilBrown <neil@...wn.name>
> Signed-off-by: Jeff Layton <jlayton@...nel.org>
> ---
> fs/nfsd/nfs4proc.c | 26 ++++++++++++++++++++++++--
> fs/nfsd/nfs4state.c | 42 ++++++++++++++++++++++++++++++++++++++++++
> fs/nfsd/state.h | 4 +++-
> 3 files changed, 69 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index aacd912a5fbe29ba5ccac206d13243308f36b7fa..bfebe6e25638a76d3607bb79a239bdc92e42e7b5 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1151,7 +1151,9 @@ vet_deleg_attrs(struct nfsd4_setattr *setattr, struct nfs4_delegation *dp)
> if (setattr->sa_bmval[2] & FATTR4_WORD2_TIME_DELEG_MODIFY) {
> if (nfsd4_vet_deleg_time(&iattr->ia_mtime, &dp->dl_mtime, &now)) {
> iattr->ia_ctime = iattr->ia_mtime;
> - if (!nfsd4_vet_deleg_time(&iattr->ia_ctime, &dp->dl_ctime, &now))
> + if (nfsd4_vet_deleg_time(&iattr->ia_ctime, &dp->dl_ctime, &now))
> + dp->dl_setattr = true;
> + else
> iattr->ia_valid &= ~(ATTR_CTIME | ATTR_CTIME_SET);
> } else {
> iattr->ia_valid &= ~(ATTR_CTIME | ATTR_CTIME_SET |
> @@ -1238,12 +1240,26 @@ nfsd4_setattr(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> return status;
> }
>
> +static void nfsd4_file_mark_deleg_written(struct nfs4_file *fi)
> +{
> + spin_lock(&fi->fi_lock);
> + if (!list_empty(&fi->fi_delegations)) {
> + struct nfs4_delegation *dp = list_first_entry(&fi->fi_delegations,
> + struct nfs4_delegation, dl_perfile);
> +
> + if (dp->dl_type == OPEN_DELEGATE_WRITE_ATTRS_DELEG)
> + dp->dl_written = true;
> + }
> + spin_unlock(&fi->fi_lock);
> +}
> +
> static __be32
> nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> union nfsd4_op_u *u)
> {
> struct nfsd4_write *write = &u->write;
> stateid_t *stateid = &write->wr_stateid;
> + struct nfs4_stid *stid = NULL;
> struct nfsd_file *nf = NULL;
> __be32 status = nfs_ok;
> unsigned long cnt;
> @@ -1256,10 +1272,15 @@ nfsd4_write(struct svc_rqst *rqstp, struct nfsd4_compound_state *cstate,
> trace_nfsd_write_start(rqstp, &cstate->current_fh,
> write->wr_offset, cnt);
> status = nfs4_preprocess_stateid_op(rqstp, cstate, &cstate->current_fh,
> - stateid, WR_STATE, &nf, NULL);
> + stateid, WR_STATE, &nf, &stid);
> if (status)
> return status;
>
> + if (stid) {
> + nfsd4_file_mark_deleg_written(stid->sc_file);
> + nfs4_put_stid(stid);
> + }
> +
> write->wr_how_written = write->wr_stable_how;
> status = nfsd_vfs_write(rqstp, &cstate->current_fh, nf,
> write->wr_offset, &write->wr_payload,
> @@ -2550,6 +2571,7 @@ nfsd4_layoutcommit(struct svc_rqst *rqstp,
> mutex_unlock(&ls->ls_mutex);
>
> nfserr = ops->proc_layoutcommit(inode, rqstp, lcp);
> + nfsd4_file_mark_deleg_written(ls->ls_stid.sc_file);
> nfs4_put_stid(&ls->ls_stid);
> out:
> return nfserr;
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 205ee8cc6fa2b9f74d08f7938b323d03bdf8286c..81fa7cc6c77b3cdc5ff22bc60ab0654f95dc258d 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1222,6 +1222,42 @@ static void put_deleg_file(struct nfs4_file *fp)
> nfs4_file_put_access(fp, NFS4_SHARE_ACCESS_READ);
> }
>
> +static void nfsd4_finalize_deleg_timestamps(struct nfs4_delegation *dp, struct file *f)
> +{
> + struct iattr ia = { .ia_valid = ATTR_ATIME | ATTR_CTIME | ATTR_MTIME };
> + struct inode *inode = file_inode(f);
> + int ret;
> +
> + /* don't do anything if FMODE_NOCMTIME isn't set */
> + if ((READ_ONCE(f->f_mode) & FMODE_NOCMTIME) == 0)
> + return;
> +
> + spin_lock(&f->f_lock);
> + f->f_mode &= ~FMODE_NOCMTIME;
> + spin_unlock(&f->f_lock);
> +
> + /* was it never written? */
> + if (!dp->dl_written)
> + return;
> +
> + /* did it get a setattr for the timestamps at some point? */
> + if (dp->dl_setattr)
> + return;
> +
> + /* Stamp everything to "now" */
> + inode_lock(inode);
> + ret = notify_change(&nop_mnt_idmap, f->f_path.dentry, &ia, NULL);
> + inode_unlock(inode);
> + if (ret) {
> + struct inode *inode = file_inode(f);
> +
> + pr_notice_ratelimited("Unable to update timestamps on inode %02x:%02x:%lu: %d\n",
> + MAJOR(inode->i_sb->s_dev),
> + MINOR(inode->i_sb->s_dev),
> + inode->i_ino, ret);
> + }
> +}
> +
> static void nfs4_unlock_deleg_lease(struct nfs4_delegation *dp)
> {
> struct nfs4_file *fp = dp->dl_stid.sc_file;
> @@ -1229,6 +1265,7 @@ static void nfs4_unlock_deleg_lease(struct nfs4_delegation *dp)
>
> WARN_ON_ONCE(!fp->fi_delegees);
>
> + nfsd4_finalize_deleg_timestamps(dp, nf->nf_file);
> kernel_setlease(nf->nf_file, F_UNLCK, NULL, (void **)&dp);
> put_deleg_file(fp);
> }
> @@ -6265,6 +6302,8 @@ nfs4_open_delegation(struct svc_rqst *rqstp, struct nfsd4_open *open,
> memcpy(&open->op_delegate_stateid, &dp->dl_stid.sc_stateid, sizeof(dp->dl_stid.sc_stateid));
>
> if (open->op_share_access & NFS4_SHARE_ACCESS_WRITE) {
> + struct file *f = dp->dl_stid.sc_file->fi_deleg_file->nf_file;
> +
> if (!nfsd4_add_rdaccess_to_wrdeleg(rqstp, open, fh, stp) ||
> !nfs4_delegation_stat(dp, currentfh, &stat)) {
> nfs4_put_stid(&dp->dl_stid);
> @@ -6278,6 +6317,9 @@ nfs4_open_delegation(struct svc_rqst *rqstp, struct nfsd4_open *open,
> dp->dl_atime = stat.atime;
> dp->dl_ctime = stat.ctime;
> dp->dl_mtime = stat.mtime;
> + spin_lock(&f->f_lock);
> + f->f_mode |= FMODE_NOCMTIME;
> + spin_unlock(&f->f_lock);
> trace_nfsd_deleg_write(&dp->dl_stid.sc_stateid);
> } else {
> open->op_delegate_type = deleg_ts && nfs4_delegation_stat(dp, currentfh, &stat) ?
> diff --git a/fs/nfsd/state.h b/fs/nfsd/state.h
> index bf9436cdb93c5dd5502ecf83433ea311e3678711..b6ac0f37e9cdfcfddde5861c8c0c51bad60101b7 100644
> --- a/fs/nfsd/state.h
> +++ b/fs/nfsd/state.h
> @@ -217,10 +217,12 @@ struct nfs4_delegation {
> struct nfs4_clnt_odstate *dl_clnt_odstate;
> time64_t dl_time;
> u32 dl_type;
> -/* For recall: */
> + /* For recall: */
> int dl_retries;
> struct nfsd4_callback dl_recall;
> bool dl_recalled;
> + bool dl_written;
> + bool dl_setattr;
>
> /* for CB_GETATTR */
> struct nfs4_cb_fattr dl_cb_fattr;
>
> --
> 2.50.1
>
>
Powered by blists - more mailing lists