linux-kernel - Re: Proposal: Use hi-res clock for file timestamps

Message-ID: <20100819094136.24fef59b@notabene>
Date:	Thu, 19 Aug 2010 09:41:36 +1000
From:	Neil Brown <neilb@...e.de>
To:	Chuck Lever <chuck.lever@...cle.com>
Cc:	"J. Bruce Fields" <bfields@...ldses.org>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	"Patrick J. LoPresti" <lopresti@...il.com>,
	Andi Kleen <andi@...stfloor.org>,
	linux-fsdevel@...r.kernel.org, linux-nfs@...r.kernel.org,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: Proposal: Use hi-res clock for file timestamps

On Wed, 18 Aug 2010 14:15:51 -0400
Chuck Lever <chuck.lever@...cle.com> wrote:

> 
> On Aug 18, 2010, at 1:32 PM, J. Bruce Fields wrote:
> 
> > On Wed, Aug 18, 2010 at 03:53:59PM +1000, Neil Brown wrote:
> >> I'm not sure you even want to pay for a per-filesystem atomic access when
> >> updating mtime.  mnt_want_write - called at the same time - seems to go to
> >> some lengths to avoid an atomic operation.
> >> 
> >> I think that nfsd should be the only place that has to pay the atomic
> >> penalty, as it is where the need is.
> >> 
> >> I imagine something like this:
> >> - Create a global struct timespec which is protected by a seqlock
> >>   Call it current_nfsd_time or similar.
> >> - file_update_time reads this and uses it if it is newer than
> >>   current_fs_time.
> >> - nfsd updates it whenever it reads an mtime out of an inode that matches
> >>   current_fs_time to the granularity of 1/HZ.
> > 
> > We can also skip the update whenever current_nfsd_time is greater than
> > the inode's mtime--that's enough to ensure that the next
> > file_update_time() call will get a time different from the inode's
> > current mtime.
> 
> Would it help if we only did this for directories, for now?
> 
> Files have close-to-open.  Directories... don't.  So we have the problem where directory changes (i.e. file creation and deletion) take a long time (sometimes an infinitely long time) to propagate to clients.  Plus: directories don't change very often, so using fine-grained time stamps only on directories wouldn't impact heavy I/O workloads.

I don't quite see how close-to-open really affects this issue - it still
relies on the timestamps and so can cache old data if a file update didn't
change the timestamp.

In my mind the difference is that near-concurrent access to files usually
involves file locking which flushes caches (and if it doesn't then you have
bigger problems) while near-concurrent access to directories relies on the
natural atomicity of dir operations so no locking or flushing occurs.

So I agree that this is probably more of an issue for directories than for
files, and that implementing it just for directories would be a sensible
first step with lower expected overhead - just my reasoning seems to be a bit
different.

Thanks,
NeilBrown
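
The scheme quoted above (a global "nfsd time" that file_update_time
consults, bumped by nfsd only when it hands out an mtime from the current
tick, and skipped when the global is already ahead of that mtime) can be
modelled in user space roughly as follows.  This is only an illustrative
sketch, not code from the thread or the kernel.  Assumptions: times are
plain 64-bit nanosecond counts, HZ is 250, the seqlock-protected struct
timespec is replaced by a single C11 atomic for brevity, nfsd bumps the
global to mtime + 1ns (the thread does not say which value to use), and
every name ending in _model, plus coarse_time, is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define HZ            250                    /* assumed tick rate */
#define NSEC_PER_SEC  1000000000LL
#define NSEC_PER_TICK (NSEC_PER_SEC / HZ)

typedef int64_t ns_t;                        /* time in nanoseconds */

struct inode_model { ns_t mtime; };          /* hypothetical stand-in for an inode */

/* Stand-in for the global timespec; an atomic replaces the seqlock here. */
static _Atomic ns_t current_nfsd_time;

/* Model of current_fs_time: truncate a precise time to its tick boundary. */
ns_t coarse_time(ns_t precise)
{
    return precise - precise % NSEC_PER_TICK;
}

/* Writer path: one atomic load, no atomic read-modify-write. */
void file_update_time_model(struct inode_model *inode, ns_t now_precise)
{
    ns_t fs_now   = coarse_time(now_precise);
    ns_t nfsd_now = atomic_load(&current_nfsd_time);

    /* Use the nfsd time only if it is newer than the coarse clock. */
    inode->mtime = nfsd_now > fs_now ? nfsd_now : fs_now;
}

/* nfsd path: the only place that pays for an atomic update. */
ns_t nfsd_read_mtime_model(const struct inode_model *inode, ns_t now_precise)
{
    ns_t mtime = inode->mtime;
    ns_t seen  = atomic_load(&current_nfsd_time);

    /* mtime from an older tick: the next update will differ anyway. */
    if (coarse_time(mtime) != coarse_time(now_precise))
        return mtime;

    /*
     * Skip the bump if the global is already greater than mtime.
     * Otherwise advance it to mtime + 1ns (assumed value); on CAS
     * failure 'seen' is reloaded and the condition is rechecked.
     */
    while (seen <= mtime &&
           !atomic_compare_exchange_weak(&current_nfsd_time, &seen, mtime + 1))
        ;

    return mtime;
}
```

In this model, two writes within one tick get the same mtime unless nfsd
has reported that mtime to a client in between; only then does the second
write receive a distinct, later timestamp.  Once the coarse clock ticks
over, it overtakes the bumped value and is used again.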