[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <1187894822.6024.8.camel@localhost.localdomain>
Date: Thu, 23 Aug 2007 11:47:02 -0700
From: john stultz <johnstul@...ibm.com>
To: Michael Smith <msmith@...h.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
linux-kernel@...r.kernel.org, Andy Wingo <wingo@...endo.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...e.hu>
Subject: Re: gettimeofday() jumping into the future
On Thu, 2007-08-23 at 14:20 +0200, Michael Smith wrote:
> On 8/23/07, Peter Zijlstra <peterz@...radead.org> wrote:
> > On Thu, 2007-08-23 at 13:08 +0200, Michael Smith wrote:
> > > We've been seeing some strange behaviour on some of our applications
> > > recently. I've tracked this down to gettimeofday() returning spurious
> > > values occasionally.
> > >
> > > Specifically, gettimeofday() will suddenly, for a single call, return
> > > a value about 4398 seconds (~1 hour 13 minutes) in the future. The
> > > following call goes back to a normal value.
> > >
> > > This seems to be occurring when the clock source goes slightly
> > > backwards for a single call. In
> > > kernel/time/timekeeping.c:__get_nsec_offset(), we have this:
> > > cycle_delta = (cycle_now - clock->cycle_last) & clock->mask;
> > >
> > > So a small decrease in time here will (this is all unsigned
> > > arithmetic) give us a very large cycle_delta. cyc2ns() then multiplies
> > > this by some value, then right shifts by 22. The resulting value (in
> > > nanoseconds) is approximately 4398 seconds; this gets added on to the
> > > xtime value, giving us our jump into the future. The next call to
> > > gettimeofday() returns to normal as we don't have this huge nanosecond
> > > offset.
> > >
> > > This system is a 2-socket core 2 quad machine (8 cpus), running 32 bit
> > > mode. It's a dell poweredge 1950. The kernel selects the TSC as the
> > > clock source, having determined that the tsc runs synchronously on
> > > this system. Switching the systems to use a different time source
> > > seems to make the problem go away (which is fine for us, but we'd like
> > > to get this fixed properly upstream).
Hmm. That does sound like unsycned TSCs. Normally Intel systems don't
skew unless they are NUMA systems or you're entering low power states.
We try to catch both of those cases, so I'm not sure how you box is
slipping through.
Can you run the following test to verify that the TSCs are skewed?
thanks
-john
View attachment "tsc-check.c" of type "text/x-csrc" (1008 bytes)
Powered by blists - more mailing lists