Message-ID: <alpine.DEB.2.00.1505112225170.28239@cobra.newdream.net>
Date: Tue, 12 May 2015 14:35:52 -0700 (PDT)
From: Sage Weil <sage@...dream.net>
To: Kevin Easton <kevin@...rana.org>
cc: Theodore Ts'o <tytso@....edu>,
Trond Myklebust <trond.myklebust@...marydata.com>,
Dave Chinner <david@...morbit.com>,
Zach Brown <zab@...hat.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Linux FS-devel Mailing List <linux-fsdevel@...r.kernel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Linux API Mailing List <linux-api@...r.kernel.org>
Subject: Re: [PATCH RFC] vfs: add a O_NOMTIME flag

On Tue, 12 May 2015, Kevin Easton wrote:
> On Mon, May 11, 2015 at 07:10:21PM -0400, Theodore Ts'o wrote:
> > On Mon, May 11, 2015 at 09:24:09AM -0700, Sage Weil wrote:
> > > > Let me re-ask the question that I asked last week (and was apparently
> > > > ignored). Why not trying to use the lazytime feature instead of
> > > > pointing a head straight at the application's --- and system
> > > > administrators' --- heads?
> > >
> > > Sorry Ted, I thought I responded already.
> > >
> > > The goal is to avoid inode writeout entirely when we can, and
> > > as I understand it lazytime will still force writeout before the inode
> > > is dropped from the cache. In systems like Ceph in particular, the
> > > IOs can be spread across lots of files, so simply deferring writeout
> > > doesn't always help.
> >
> > Sure, but it would reduce the writeout by orders of magnitude. I can
> > understand if you want to reduce it further, but it might be good
> > enough for your purposes.
> >
> > I considered doing the equivalent of O_NOMTIME for our purposes at
> > $WORK, and our use case is actually not that different from Ceph's
> > (i.e., using a local disk file system to support a cluster file
> > system), and lazytime was (a) something I figured was something I
> > could upstream in good conscience, and (b) was more than good enough
> > for us.
>
> A safer alternative might be a chattr file attribute that if set, the
> mtime is not updated on writes, and stat() on the file always shows the
> mtime as "right now". At least that way, the file won't accidentally
> get left out of backups that rely on the mtime.
>
> (If the file attribute is unset, you immediately update the mtime then
> too, and from then on the file is back to normal).

Interesting! I didn't realize there was already a chattr +A that disabled
atime (although I suspect it doesn't do the "right now" for stat thing).
This makes the nomtime-ness a bit more obscure (I don't think most users
would think to check these file attributes), but it's a safer failure
condition for backups at least.

The fact that chattr +A (and hopefully +M) will work for non-root is a
bonus, as we're also trying to get ceph daemons to drop most privileges.

sage
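
The attribute semantics Kevin proposes in the quoted text can be read as
three rules: writes skip the mtime update while the attribute is set,
stat() reports "right now" while it is set, and clearing it stamps the
mtime immediately. What follows is a minimal Python sketch of those
rules as a toy in-memory model, not kernel code. ModelFile, nomtime,
inode_dirtied and the injectable clock are hypothetical names chosen
for illustration; the thread only floats the idea of a "+M" attribute
and does not define an interface.

```python
import time


class ModelFile:
    """Toy in-memory model of the proposed attribute; not kernel code."""

    def __init__(self, clock=time.time):
        self._clock = clock            # injectable clock, so behaviour is testable
        self.nomtime = False           # hypothetical attribute (the "+M" idea)
        self._stored_mtime = clock()   # stands in for the mtime kept in the inode
        self.inode_dirtied = 0         # mtime updates that would dirty the inode

    def write(self):
        """A data write: updates the stored mtime unless the attribute is set."""
        if not self.nomtime:
            self._stored_mtime = self._clock()
            self.inode_dirtied += 1

    def stat_mtime(self):
        """What stat() would report: the current time while the attribute is set."""
        if self.nomtime:
            return self._clock()
        return self._stored_mtime

    def set_nomtime(self, enabled):
        """Set or clear the attribute; clearing stamps the mtime immediately."""
        if self.nomtime and not enabled:
            self._stored_mtime = self._clock()
            self.inode_dirtied += 1
        self.nomtime = enabled
```

In this model a backup tool that compares mtimes always sees a file with
the attribute set as recently modified, which is the "safer failure
condition" discussed above, while writes to such a file never count as
dirtying the inode for mtime.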