Date:   Wed, 20 Sep 2023 15:03:43 +0200
From:   Jan Kara <jack@...e.cz>
To:     Christian Brauner <brauner@...nel.org>
Cc:     Jan Kara <jack@...e.cz>, Jeff Layton <jlayton@...nel.org>,
        Bruno Haible <bruno@...sp.org>,
        Xi Ruoyao <xry111@...uxfromscratch.org>, bug-gnulib@....org,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Eric Van Hensbergen <ericvh@...nel.org>,
        Latchesar Ionkov <lucho@...kov.net>,
        Dominique Martinet <asmadeus@...ewreck.org>,
        Christian Schoenebeck <linux_oss@...debyte.com>,
        David Howells <dhowells@...hat.com>,
        Marc Dionne <marc.dionne@...istor.com>,
        Chris Mason <clm@...com>, Josef Bacik <josef@...icpanda.com>,
        David Sterba <dsterba@...e.com>, Xiubo Li <xiubli@...hat.com>,
        Ilya Dryomov <idryomov@...il.com>,
        Jan Harkes <jaharkes@...cmu.edu>, coda@...cmu.edu,
        Tyler Hicks <code@...icks.com>, Gao Xiang <xiang@...nel.org>,
        Chao Yu <chao@...nel.org>, Yue Hu <huyue2@...lpad.com>,
        Jeffle Xu <jefflexu@...ux.alibaba.com>,
        Namjae Jeon <linkinjeon@...nel.org>,
        Sungjong Seo <sj1557.seo@...sung.com>,
        Jan Kara <jack@...e.com>, Theodore Ts'o <tytso@....edu>,
        Andreas Dilger <adilger.kernel@...ger.ca>,
        Jaegeuk Kim <jaegeuk@...nel.org>,
        OGAWA Hirofumi <hirofumi@...l.parknet.co.jp>,
        Miklos Szeredi <miklos@...redi.hu>,
        Bob Peterson <rpeterso@...hat.com>,
        Andreas Gruenbacher <agruenba@...hat.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Tejun Heo <tj@...nel.org>,
        Trond Myklebust <trond.myklebust@...merspace.com>,
        Anna Schumaker <anna@...nel.org>,
        Konstantin Komarov <almaz.alexandrovich@...agon-software.com>,
        Mark Fasheh <mark@...heh.com>,
        Joel Becker <jlbec@...lplan.org>,
        Joseph Qi <joseph.qi@...ux.alibaba.com>,
        Mike Marshall <hubcap@...ibond.com>,
        Martin Brandenburg <martin@...ibond.com>,
        Luis Chamberlain <mcgrof@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Iurii Zaikin <yzaikin@...gle.com>,
        Steve French <sfrench@...ba.org>,
        Paulo Alcantara <pc@...guebit.com>,
        Ronnie Sahlberg <ronniesahlberg@...il.com>,
        Shyam Prasad N <sprasad@...rosoft.com>,
        Tom Talpey <tom@...pey.com>,
        Sergey Senozhatsky <senozhatsky@...omium.org>,
        Richard Weinberger <richard@....at>,
        Hans de Goede <hdegoede@...hat.com>,
        Hugh Dickins <hughd@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Amir Goldstein <l@...il.com>,
        "Darrick J. Wong" <djwong@...nel.org>,
        Benjamin Coddington <bcodding@...hat.com>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        v9fs@...ts.linux.dev, linux-afs@...ts.infradead.org,
        linux-btrfs@...r.kernel.org, ceph-devel@...r.kernel.org,
        codalist@...a.cs.cmu.edu, ecryptfs@...r.kernel.org,
        linux-erofs@...ts.ozlabs.org, linux-ext4@...r.kernel.org,
        linux-f2fs-devel@...ts.sourceforge.net, cluster-devel@...hat.com,
        linux-nfs@...r.kernel.org, ntfs3@...ts.linux.dev,
        ocfs2-devel@...ts.linux.dev, devel@...ts.orangefs.org,
        linux-cifs@...r.kernel.org, samba-technical@...ts.samba.org,
        linux-mtd@...ts.infradead.org, linux-mm@...ck.org,
        linux-unionfs@...r.kernel.org, linux-xfs@...r.kernel.org
Subject: Re: [PATCH v7 12/13] ext4: switch to multigrain timestamps

On Wed 20-09-23 12:30:52, Christian Brauner wrote:
> On Wed, Sep 20, 2023 at 12:17:31PM +0200, Jan Kara wrote:
> > On Wed 20-09-23 10:41:30, Christian Brauner wrote:
> > > > > f1 was last written to *after* f2 was last written to. If the timestamp of f1
> > > > > is then lower than the timestamp of f2, timestamps are fundamentally broken.
> > > > > 
> > > > > Many things in user-space depend on timestamps, such as build system
> > > > > centered around 'make', but also 'find ... -newer ...'.
> > > > > 
> > > > 
> > > > 
> > > > What does breakage with make look like in this situation? The "fuzz"
> > > > here is going to be on the order of a jiffy. The typical case for make
> > > > timestamp comparisons is comparing source files vs. a build target. If
> > > > those are being written nearly simultaneously, then that could be an
> > > > issue, but is that a typical behavior? It seems like it would be hard to
> > > > rely on that anyway, esp. given filesystems like NFS that can do lazy
> > > > writeback.
> > > > 
> > > > One of the operating principles with this series is that timestamps can
> > > > be of varying granularity between different files. Note that Linux
> > > > already violates this assumption when you're working across filesystems
> > > > of different types.
> > > > 
> > > > As to potential fixes if this is a real problem:
> > > > 
> > > > I don't really want to put this behind a mount or mkfs option (a'la
> > > > relatime, etc.), but that is one possibility.
> > > > 
> > > > I wonder if it would be feasible to just advance the coarse-grained
> > > > current_time whenever we end up updating a ctime with a fine-grained
> > > > timestamp? It might produce some inode write amplification. Files that
> > > 
> > > Less than ideal imho.
> > > 
> > > If this risks breaking existing workloads by enabling it unconditionally
> > > and there isn't a clear way to detect and handle these situations
> > > without risk of regression then we should move this behind a mount
> > > option.
> > > 
> > > So how about the following:
> > > 
> > > From cb14add421967f6e374eb77c36cc4a0526b10d17 Mon Sep 17 00:00:00 2001
> > > From: Christian Brauner <brauner@...nel.org>
> > > Date: Wed, 20 Sep 2023 10:00:08 +0200
> > > Subject: [PATCH] vfs: move multi-grain timestamps behind a mount option
> > > 
> > > While we initially thought we can do this unconditionally it turns out
> > > that this might break existing workloads that rely on timestamps in very
> > > specific ways and we always knew this was a possibility. Move
> > > multi-grain timestamps behind a vfs mount option.
> > > 
> > > Signed-off-by: Christian Brauner <brauner@...nel.org>
> > 
> > Surely this is a safe choice as it moves the responsibility to the sysadmin
> > and the cases where finegrained timestamps are required. But I kind of
> > wonder how is the sysadmin going to decide whether mgtime is safe for his
> > system or not? Because the possible breakage needn't be obvious at the
> > first sight... If I were a sysadmin, I'd rather opt for something like
> 
> I think you'll basically enable this because you want to export a
> filesystem via NFS.

OK, that's what I thought but then you have to make a tough choice between:

1) Possibly inconsistent NFS caches on frequent changes.
2) Possibly broken builds on NFS.

Pick your poison ;)
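
To make option 2 a bit more concrete, here is a toy userspace model (not
kernel code) of how mixing a coarse, tick-granular clock with a
fine-grained one could invert the apparent order of two writes. The 4 ms
tick, the write times, and the function names are all assumptions made up
for illustration.

```python
# Toy model: two files stamped from clocks of different granularity.
# TICK_NS and both helpers are hypothetical, not kernel interfaces.
TICK_NS = 4_000_000  # assumed tick length (HZ=250), illustration only


def coarse_stamp(now_ns: int) -> int:
    """Coarse clock: the time as of the most recent tick boundary."""
    return now_ns - now_ns % TICK_NS


def fine_stamp(now_ns: int) -> int:
    """Fine-grained clock: the actual time in nanoseconds."""
    return now_ns


# f2 is written first and happens to receive a fine-grained timestamp.
f2_write_ns = 10_001_000
f2_stamp = fine_stamp(f2_write_ns)

# f1 is written later in the same tick but receives a coarse timestamp.
f1_write_ns = 10_003_000
f1_stamp = coarse_stamp(f1_write_ns)

# f1 was written after f2, yet its recorded timestamp is older.
inverted = f1_write_ns > f2_write_ns and f1_stamp < f2_stamp

print(f"f2 written at {f2_write_ns} ns, stamped {f2_stamp} ns")
print(f"f1 written at {f1_write_ns} ns, stamped {f1_stamp} ns")
print(f"order inverted: {inverted}")
```

In this model a tool comparing the two stamps ('make', 'find -newer')
would conclude that f1 is older than f2, although it was written later.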

> > finegrained timestamps + lazytime (if I needed the finegrained timestamps
> > functionality). That should avoid the IO overhead of finegrained timestamps
> 
> That would work with this patch, no? Or are you saying it would need
> something else?

Sorry, I was not really precise here. What I meant was that instead of
having multigrain timestamps, I (as a sysadmin) would want the filesystem
to set sb->s_time_gran to 1 ns and use lazytime to remove the IO overhead
of the frequent timestamp updates. But that is just me brainstorming
possible solutions of the original NFS problem.
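
As a rough sketch of what the granularity setting means for the stored
value, the following userspace model rounds a timestamp down to a
multiple of the granularity. It is only loosely modeled on how the VFS
truncates timestamps, as far as I understand it. The function name and
the sample values are hypothetical, and the error handling is a choice of
this sketch.

```python
NSEC_PER_SEC = 1_000_000_000


def truncate_timestamp(sec: int, nsec: int, gran: int) -> tuple[int, int]:
    """Round (sec, nsec) down to a multiple of gran nanoseconds."""
    if gran == 1:
        # 1 ns granularity: the timestamp is stored unchanged.
        return sec, nsec
    if gran == NSEC_PER_SEC:
        # 1 s granularity: the sub-second part is dropped entirely.
        return sec, 0
    if 1 < gran < NSEC_PER_SEC:
        return sec, nsec - nsec % gran
    # Rejecting other values is a decision of this sketch only.
    raise ValueError(f"unsupported granularity: {gran}")


sample = (1695214223, 123_456_789)  # arbitrary example timestamp
for gran in (1, 1_000, NSEC_PER_SEC):
    print(gran, truncate_timestamp(*sample, gran))
```

With a granularity of 1 ns nothing is lost, so every update is
distinguishable. The cost is the more frequent timestamp writes that
lazytime would be there to absorb.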

> > as well and I'd know I can have problems with timestamps only after a
> > system crash.
> > 
> > I've just got another idea how we could solve the problem: Couldn't we
> > always just report coarsegrained timestamp to userspace and provide access
> > to finegrained value only to NFS which should know what it's doing?
> 
> What changes would be involved for that?

See my other email. It should be fairly small...

> If this is invasive work and we decide this is something that we want to
> do then we should remove FS_MGTIME from btrfs, xfs, ext4, and tmpfs for
> v6.6.

... but let's see what Jeff thinks. I may be missing some problem with the
solution.

								Honza
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR
