Message-ID: <20100819225300.GD9275@fieldses.org>
Date:	Thu, 19 Aug 2010 18:53:01 -0400
From:	"J. Bruce Fields" <bfields@...ldses.org>
To:	john stultz <johnstul@...ibm.com>
Cc:	"Patrick J. LoPresti" <lopresti@...il.com>,
	Alan Cox <alan@...rguk.ukuu.org.uk>,
	Andi Kleen <andi@...stfloor.org>,
	linux-fsdevel@...r.kernel.org, linux-nfs@...r.kernel.org,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: Proposal: Use hi-res clock for file timestamps

On Wed, Aug 18, 2010 at 08:17:14PM -0700, john stultz wrote:
> On Wed, 2010-08-18 at 22:31 -0400, J. Bruce Fields wrote:
> > On Wed, Aug 18, 2010 at 06:41:02PM -0700, john stultz wrote:
> > > On Wed, Aug 18, 2010 at 11:12 AM, J. Bruce Fields <bfields@...ldses.org> wrote:
> > > > I'm completely ignorant about higher-resolution time sources.  Any
> > > > recommended reading?  What resolution do they actually provide, what's
> > > > the expense of reading them, how reliable are they, and how do the
> > > > answers to those questions vary across different hardware and kernel
> > > > versions?  A quick look at drivers/clocksource/ doesn't suggest
> > > > simple answers.
> > > 
> > > Yea, there aren't simple answers. Clocksource hardware varies
> > > drastically in resolution and access time across systems and
> > > architectures. Further, clocksources may change while the system is
> > > up, so we don't really expose the hardware resolution.
> > > 
> > > On x86, access latency varies from ~50ns (TSC) to ~1.3us (ACPI PM).
> > > (And that is ignoring the PIT, which can be 18us per call - luckily
> > > almost no hardware uses that). The resolution similarly scales from
> > > sub-ns (TSC @ > 1 GHz CPUs) to ~279ns (ACPI PM). Of course, across
> > > architectures you will see even more variance.
> > 
> > The race in question occurs when you manage to check mtime between two
> > file data updates, with all three operations occurring within a clock
> > tick.
> > 
> > No idea if that's feasible in hundreds of nanoseconds.
> 
> I think this is what Andi meant when he said you'll always race with
> time, and that version counters are the only real solution here.
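
(A minimal sketch of the race described above, not code from this thread.
It simulates a clock that only advances once per tick and a reader that
samples mtime between two writes. The 4 ms tick, the File class, and
reader_sees_stale_data() are hypothetical names chosen for illustration.)

```python
# Hypothetical tick length: 4 ms, as with HZ=250. This is an assumption.
TICK_NS = 4_000_000


def coarse(now_ns, tick_ns=TICK_NS):
    """Round a nanosecond timestamp down to the start of its tick."""
    return now_ns - now_ns % tick_ns


class File:
    """Toy file whose mtime is stamped from the coarse clock."""

    def __init__(self):
        self.data = b""
        self.mtime = 0

    def write(self, data, now_ns):
        self.data = data
        self.mtime = coarse(now_ns)


def reader_sees_stale_data(second_write_offset_ns):
    """Return True if an mtime comparison misses the second write."""
    f = File()
    base = 10 * TICK_NS                 # an arbitrary tick boundary
    f.write(b"v1", base + 100)          # first update
    cached_mtime, cached_data = f.mtime, f.data   # reader checks mtime here
    f.write(b"v2", base + second_write_offset_ns)  # second update
    # The reader revalidates by mtime only: an unchanged mtime with
    # changed data means the cache is silently stale.
    return f.mtime == cached_mtime and f.data != cached_data
```

With both writes inside one tick, reader_sees_stale_data(200) reports the
stale cache. Once the second write lands in a later tick, for example
reader_sees_stale_data(TICK_NS + 200), the mtime changes and the reader
notices. A finer clock shrinks the window but does not close it.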

Yeah.  That'll work for NFSv4.  But if possible it'd be nice to have a
solution for NFSv3.

As compared to using a higher-resolution time source, a solution for
mtime based on a global counter would provide better guarantees (on
filesystems that can store the extra bits), and perform better.  (What
is the worst-case latency if we're bouncing a cache line back and forth
between two CPUs?)  Though I guess the possible performance hit would
rule it out for users that didn't specifically ask for it.  (So, no help
for userspace NFS servers, make, or whoever else might (wisely or not)
already depend on mtime detecting changes reliably.)
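
To make the idea concrete, here is one way a counter-backed mtime might
look. This is purely an illustration under stated assumptions, not an
existing kernel interface: it assumes the filesystem can store a
(tick, sequence) pair, and the GlobalStamp class and its next() method
are hypothetical. The lock stands in for the shared cache line that
every CPU would have to touch.

```python
import threading


class GlobalStamp:
    """Hands out strictly increasing (tick, seq) stamps for all files."""

    def __init__(self):
        self._lock = threading.Lock()   # the single point of contention
        self._last_tick = 0
        self._seq = 0

    def next(self, coarse_now_ns):
        """Return a stamp greater than every stamp handed out before."""
        with self._lock:
            if coarse_now_ns > self._last_tick:
                # The clock moved forward: start a new sequence.
                self._last_tick = coarse_now_ns
                self._seq = 0
            else:
                # Same tick (or the clock stepped backwards): keep the
                # last tick and bump the sequence so order is preserved.
                self._seq += 1
            return (self._last_tick, self._seq)


stamps = GlobalStamp()
a = stamps.next(1000)   # first update in tick 1000
b = stamps.next(1000)   # second update, same tick
c = stamps.next(2000)   # update in a later tick
d = stamps.next(1500)   # clock appears to go backwards
```

Tuples compare lexicographically, so a < b < c < d holds even though b
shares a tick with a and d was taken from an earlier clock reading. The
price is that every update serializes on one shared location.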

> > I'm also not sure how to judge the access latency.  Certainly a
> > microsecond is a lot compared to just reading a cached mtime value.
> > 
> > Will we ever see them go backwards?  (So if I know I wrote to file B
> > after writing to file A, is there ever a case where I could end up with
> > an earlier mtime on B than A?)
> 
> You should not. However, there have been bugs in the past, and there
> will probably be a few more in the future.
> 
> There are also theoretical issues with SMP systems where the TSCs are
> not perfectly synced, but the window for those races should be small
> (i.e., smaller than can be detected - otherwise we'll throw out the TSC).

Got it.  Thanks for your help!

--b.
