linux-kernel - Re: [PATCH v5 6/9] xfs: switch to multigrain timestamps

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <95a135dcec10423b9bcb9f53a1420d80b4afdba7.camel@kernel.org>
Date: Thu, 11 Jul 2024 11:58:59 -0400
From: Jeff Layton <jlayton@...nel.org>
To: "Darrick J. Wong" <djwong@...nel.org>
Cc: Alexander Viro <viro@...iv.linux.org.uk>, Christian Brauner
 <brauner@...nel.org>, Jan Kara <jack@...e.cz>, Steven Rostedt
 <rostedt@...dmis.org>, Masami Hiramatsu <mhiramat@...nel.org>, Mathieu
 Desnoyers <mathieu.desnoyers@...icios.com>, Chandan Babu R
 <chandan.babu@...cle.com>, Theodore Ts'o <tytso@....edu>, Andreas Dilger
 <adilger.kernel@...ger.ca>, Chris Mason <clm@...com>, Josef Bacik
 <josef@...icpanda.com>, David Sterba <dsterba@...e.com>,  Hugh Dickins
 <hughd@...gle.com>, Andrew Morton <akpm@...ux-foundation.org>, Jonathan
 Corbet <corbet@....net>,  Dave Chinner <david@...morbit.com>, Andi Kleen
 <ak@...ux.intel.com>, Christoph Hellwig <hch@...radead.org>,  Uros Bizjak
 <ubizjak@...il.com>, Kent Overstreet <kent.overstreet@...ux.dev>, Arnd
 Bergmann <arnd@...db.de>,  Randy Dunlap <rdunlap@...radead.org>,
 kernel-team@...com, linux-fsdevel@...r.kernel.org, 
 linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org, 
 linux-xfs@...r.kernel.org, linux-ext4@...r.kernel.org, 
 linux-btrfs@...r.kernel.org, linux-mm@...ck.org, linux-nfs@...r.kernel.org,
  linux-doc@...r.kernel.org
Subject: Re: [PATCH v5 6/9] xfs: switch to multigrain timestamps

On Thu, 2024-07-11 at 08:09 -0700, Darrick J. Wong wrote:
> On Thu, Jul 11, 2024 at 07:08:10AM -0400, Jeff Layton wrote:
> > Enable multigrain timestamps, which should ensure that there is an
> > apparent change to the timestamp whenever it has been written after
> > being actively observed via getattr.
> > 
> > Also, anytime the mtime changes, the ctime must also change, and those
> > are now the only two options for xfs_trans_ichgtime. Have that function
> > unconditionally bump the ctime, and ASSERT that XFS_ICHGTIME_CHG is
> > always set.
> > 
> > Finally, stop setting STATX_CHANGE_COOKIE in getattr, since the ctime
> > should give us better semantics now.
> 
> Following up on "As long as the fs isn't touching i_ctime_nsec directly,
> you shouldn't need to worry about this" from:
> https://lore.kernel.org/linux-xfs/cae5c28f172ac57b7eaaa98a00b23f342f01ba64.camel@kernel.org/
> 
> xfs /does/ touch i_ctime_nsec directly when it's writing inodes to disk.
> From xfs_inode_to_disk, see:
> 
> 	to->di_ctime = xfs_inode_to_disk_ts(ip, inode_get_ctime(inode));
> 
> AFAICT, inode_get_ctime itself remains unchanged, and still returns
> inode->__i_ctime, right?  In which case it's returning a raw timespec64,
> which can include the QUERIED flag in tv_nsec, right?
> 

No, in the first patch in the series, inode_get_ctime becomes this:

#define I_CTIME_QUERIED         ((u32)BIT(31))

static inline time64_t inode_get_ctime_sec(const struct inode *inode)
{
        return inode->i_ctime_sec;
}

static inline long inode_get_ctime_nsec(const struct inode *inode)
{
        return inode->i_ctime_nsec & ~I_CTIME_QUERIED;
}

static inline struct timespec64 inode_get_ctime(const struct inode *inode)
{
        struct timespec64 ts = { .tv_sec  = inode_get_ctime_sec(inode),
                                 .tv_nsec = inode_get_ctime_nsec(inode) };

        return ts;
}

...which should ensure that you never store the QUERIED bit.

> Now let's look at the consumer:
> 
> static inline xfs_timestamp_t
> xfs_inode_to_disk_ts(
> 	struct xfs_inode		*ip,
> 	const struct timespec64		tv)
> {
> 	struct xfs_legacy_timestamp	*lts;
> 	xfs_timestamp_t			ts;
> 
> 	if (xfs_inode_has_bigtime(ip))
> 		return cpu_to_be64(xfs_inode_encode_bigtime(tv));
> 
> 	lts = (struct xfs_legacy_timestamp *)&ts;
> 	lts->t_sec = cpu_to_be32(tv.tv_sec);
> 	lts->t_nsec = cpu_to_be32(tv.tv_nsec);
> 
> 	return ts;
> }
> 
> For the !bigtime case (aka before we added y2038 support) the queried
> flag gets encoded into the tv_nsec field since xfs doesn't filter the
> queried flag.
> 
> For the bigtime case, the timespec is turned into an absolute nsec count
> since the xfs epoch (which is the minimum timestamp possible under the
> old encoding scheme):
> 
> static inline uint64_t xfs_inode_encode_bigtime(struct timespec64 tv)
> {
> 	return xfs_unix_to_bigtime(tv.tv_sec) * NSEC_PER_SEC + tv.tv_nsec;
> }
> 
> Here we'd also be mixing in the QUERIED flag, only now we've encoded a
> time that's a second in the future.  I think the solution is to add a:
> 
> static inline struct timespec64
> inode_peek_ctime(const struct inode *inode)
> {
> 	return (struct timespec64){
> 		.tv_sec = inode->__i_ctime.tv_sec,
> 		.tv_nsec = inode->__i_ctime.tv_nsec & ~I_CTIME_QUERIED,
> 	};
> }
> 
> similar to what inode_peek_iversion does for iversion; and then
> xfs_inode_to_disk can do:
> 
> 	to->di_ctime = xfs_inode_to_disk_ts(ip, inode_peek_ctime(inode));
> 
> which would prevent I_CTIME_QUERIED from going out to disk.
> 
> At load time, xfs_inode_from_disk uses inode_set_ctime_to_ts so I think
> xfs won't accidentally introduce QUERIED when it's loading an inode from
> disk.
> 
> 

Also already done in this patchset:

struct timespec64 inode_set_ctime_to_ts(struct inode *inode, struct timespec64 ts)
{
        inode->i_ctime_sec = ts.tv_sec;
        inode->i_ctime_nsec = ts.tv_nsec & ~I_CTIME_QUERIED;
        trace_inode_set_ctime_to_ts(inode, &ts);
        return ts;
}
EXPORT_SYMBOL(inode_set_ctime_to_ts);

Basically, we never want to store or fetch the QUERIED flag from disk,
and since it's in an unused bit, we can just universally mask it off
when dealing with "external" users of it.

One caveat -- I am using the sign bit for the QUERIED flag, so I'm
assuming that no one should ever pass inode_set_ctime_to_ts a negative
tv_nsec value.

Maybe I should add a WARN_ON_ONCE here to check for that? It seems
nonsensical, but you never know...

> > Signed-off-by: Jeff Layton <jlayton@...nel.org>
> > ---
> >  fs/xfs/libxfs/xfs_trans_inode.c |  6 +++---
> >  fs/xfs/xfs_iops.c               | 10 +++-------
> >  fs/xfs/xfs_super.c              |  2 +-
> >  3 files changed, 7 insertions(+), 11 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_trans_inode.c b/fs/xfs/libxfs/xfs_trans_inode.c
> > index 69fc5b981352..1f3639bbf5f0 100644
> > --- a/fs/xfs/libxfs/xfs_trans_inode.c
> > +++ b/fs/xfs/libxfs/xfs_trans_inode.c
> > @@ -62,12 +62,12 @@ xfs_trans_ichgtime(
> >  	ASSERT(tp);
> >  	xfs_assert_ilocked(ip, XFS_ILOCK_EXCL);
> >  
> > -	tv = current_time(inode);
> > +	/* If the mtime changes, then ctime must also change */
> > +	ASSERT(flags & XFS_ICHGTIME_CHG);
> >  
> > +	tv = inode_set_ctime_current(inode);
> >  	if (flags & XFS_ICHGTIME_MOD)
> >  		inode_set_mtime_to_ts(inode, tv);
> > -	if (flags & XFS_ICHGTIME_CHG)
> > -		inode_set_ctime_to_ts(inode, tv);
> >  	if (flags & XFS_ICHGTIME_CREATE)
> >  		ip->i_crtime = tv;
> >  }
> > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> > index a00dcbc77e12..d25872f818fa 100644
> > --- a/fs/xfs/xfs_iops.c
> > +++ b/fs/xfs/xfs_iops.c
> > @@ -592,8 +592,9 @@ xfs_vn_getattr(
> >  	stat->gid = vfsgid_into_kgid(vfsgid);
> >  	stat->ino = ip->i_ino;
> >  	stat->atime = inode_get_atime(inode);
> > -	stat->mtime = inode_get_mtime(inode);
> > -	stat->ctime = inode_get_ctime(inode);
> > +
> > +	fill_mg_cmtime(stat, request_mask, inode);
> > +
> >  	stat->blocks = XFS_FSB_TO_BB(mp, ip->i_nblocks + ip->i_delayed_blks);
> >  
> >  	if (xfs_has_v3inodes(mp)) {
> > @@ -603,11 +604,6 @@ xfs_vn_getattr(
> >  		}
> >  	}
> >  
> > -	if ((request_mask & STATX_CHANGE_COOKIE) && IS_I_VERSION(inode)) {
> > -		stat->change_cookie = inode_query_iversion(inode);
> > -		stat->result_mask |= STATX_CHANGE_COOKIE;
> > -	}
> > -
> >  	/*
> >  	 * Note: If you add another clause to set an attribute flag, please
> >  	 * update attributes_mask below.
> > diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> > index 27e9f749c4c7..210481b03fdb 100644
> > --- a/fs/xfs/xfs_super.c
> > +++ b/fs/xfs/xfs_super.c
> > @@ -2052,7 +2052,7 @@ static struct file_system_type xfs_fs_type = {
> >  	.init_fs_context	= xfs_init_fs_context,
> >  	.parameters		= xfs_fs_parameters,
> >  	.kill_sb		= xfs_kill_sb,
> > -	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP,
> > +	.fs_flags		= FS_REQUIRES_DEV | FS_ALLOW_IDMAP | FS_MGTIME,
> >  };
> >  MODULE_ALIAS_FS("xfs");
> >  
> > 
> > -- 
> > 2.45.2
> > 

-- 
Jeff Layton <jlayton@...nel.org>