Message-ID: <3c1737210708230408i7a8049a9m5db49e6c4d89ab62@mail.gmail.com>
Date:	Thu, 23 Aug 2007 13:08:27 +0200
From:	"Michael Smith" <msmith@...h.org>
To:	linux-kernel@...r.kernel.org
Cc:	"Andy Wingo" <wingo@...endo.com>
Subject: gettimeofday() jumping into the future

Hi,

We've been seeing some strange behaviour in some of our applications
recently. I've tracked this down to gettimeofday() occasionally
returning spurious values.

Specifically, gettimeofday() will suddenly, for a single call, return
a value about 4398 seconds (~1 hour 13 minutes) in the future. The
following call goes back to a normal value.

This seems to occur when the clock source reading goes slightly
backwards for a single call, i.e. cycle_now comes back just below
clock->cycle_last. In kernel/time/timekeeping.c:__get_nsec_offset(),
we have this:
 cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;

Since this is all unsigned arithmetic, a small decrease in time here
gives us a very large cycle_delta. cyc2ns() then multiplies this by
clock->mult and right shifts by clock->shift (22 here), so a
near-maximal 64-bit cycle_delta comes out at roughly 2^42 ns, which is
approximately 4398 seconds. That offset gets added on to the xtime
value, giving us our jump into the future. The next call to
gettimeofday() returns to normal, as the huge nanosecond offset is
gone.
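
To make the arithmetic concrete, here's a small userspace sketch of
the same calculation (not the kernel code itself; the mult value is
just an assumed figure for a ~2.33 GHz TSC, but any realistic mult
gives roughly 2^42 ns once the delta wraps):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
        /* Values mirroring the clocksource fields; mult/shift assumed. */
        uint64_t cycle_last = 1000000;  /* cached count from the last update */
        uint64_t cycle_now  = 999990;   /* TSC read 10 cycles behind cycle_last */
        uint64_t mask  = ~0ULL;         /* 64-bit clocksource mask */
        uint64_t mult  = 1800000;       /* assumed ns conversion factor */
        unsigned shift = 22;

        /* Same unsigned arithmetic as __get_nsec_offset() / cyc2ns() */
        uint64_t cycle_delta = (cycle_now - cycle_last) & mask;
        uint64_t ns_offset   = (cycle_delta * mult) >> shift;

        /* Prints an offset of about 2^42 ns, i.e. ~4398 seconds */
        printf("cycle_delta = %llu\n", (unsigned long long)cycle_delta);
        printf("ns_offset   = %llu ns (~%llu s)\n",
               (unsigned long long)ns_offset,
               (unsigned long long)(ns_offset / 1000000000ULL));
        return 0;
}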

This system is a 2-socket Core 2 Quad machine (8 CPUs) running in
32-bit mode; it's a Dell PowerEdge 1950. The kernel selects the TSC as
the clock source, having determined that the TSC runs synchronously on
this system. Switching the systems to a different clock source seems
to make the problem go away (which is fine for us, but we'd like to
get this fixed properly upstream).

We've also seen this behaviour with a synthetic test program (which
just runs four threads, all calling gettimeofday() in a loop as fast
as possible and checking that the value doesn't jump) on an older
machine, a Dell PowerEdge SC1425 with two hyperthreaded P4 Xeons.
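
For reference, the test is roughly along these lines (a simplified
sketch rather than the exact program; the one-second jump threshold is
an arbitrary choice for illustration, the real failure is a ~4398 s
jump). It's built with -lpthread and left running until the warning
fires:

#include <pthread.h>
#include <stdio.h>
#include <sys/time.h>

#define NTHREADS 4

static void *hammer(void *arg)
{
        struct timeval prev, now;
        long long delta_us;

        (void)arg;
        gettimeofday(&prev, NULL);
        for (;;) {
                gettimeofday(&now, NULL);
                delta_us = (now.tv_sec - prev.tv_sec) * 1000000LL +
                           (now.tv_usec - prev.tv_usec);
                /* Flag anything going backwards or leaping over a second */
                if (delta_us < 0 || delta_us > 1000000LL)
                        fprintf(stderr, "time jumped by %lld us\n", delta_us);
                prev = now;
        }
        return NULL;
}

int main(void)
{
        pthread_t threads[NTHREADS];
        int i;

        for (i = 0; i < NTHREADS; i++)
                pthread_create(&threads[i], NULL, hammer, NULL);
        for (i = 0; i < NTHREADS; i++)
                pthread_join(threads[i], NULL);
        return 0;
}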

Can anyone advise on what's going wrong here? I can't find much in
the way of documentation on whether the TSC is guaranteed to be
monotonically increasing on Intel systems. Should the code choose not
to use the TSC? Or should the TSC reading code ensure that the
returned values are monotonic?
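
To be concrete about that last option, I'm imagining something along
the lines of the guard below (purely illustrative, not a proposed
patch; the cutoff is arbitrary):

#include <stdint.h>

/*
 * Hypothetical guard: if the masked delta lands in the top half of the
 * range, assume the counter stepped slightly backwards and report no
 * elapsed time instead of ~2^42 ns.
 */
static inline uint64_t clamped_cycle_delta(uint64_t cycle_now,
                                           uint64_t cycle_last,
                                           uint64_t mask)
{
        uint64_t delta = (cycle_now - cycle_last) & mask;

        if (delta > (mask >> 1))
                return 0;
        return delta;
}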

Is there any more information that would be useful? I'll be on a plane
for most of tomorrow, so might be a little slow responding.

Thanks,

Mike
