Message-ID: <1269888291.3968.5.camel@localhost.localdomain>
Date:	Mon, 29 Mar 2010 11:44:51 -0700
From:	john stultz <johnstul@...ibm.com>
To:	Yury Polyanskiy <ypolyans@...nceton.edu>
Cc:	Joel Becker <Joel.Becker@...cle.com>, linux-kernel@...r.kernel.org,
	Andrew Morton <akpm@...l.org>,
	Jan Glauber <jan.glauber@...ibm.com>
Subject: Re: [PATCH] hangcheck-timer is broken on x86

On Mon, 2010-03-29 at 13:04 -0400, Yury Polyanskiy wrote:
> On Mon, 29 Mar 2010 09:43:27 -0700
> john stultz <johnstul@...ibm.com> wrote:
> 
> > > I am not sure which archs do you mean. But in any case,
> > > getrawmonotonic() is not just a wrap around a call to rdtsc() (or acpi
> > > pm timer read). It is based on the clock->raw_time, which is updated
> > > every timer interrupt by the update_wall_time(). So even if underlying
> > > timer wraps, it doesn't lead to getrawmonotonic() returning 0 sec.  
> > 
> > What I'm saying is that if you're using getrawmonotonic() to detect
> > hangs, you might miss them, as getrawmonotonic may wrap (and thus stop
> > continually increasing) if the timer interrupt is delayed. This does not
> > apply to systems using the TSC clocksource, but does apply to systems
> > using the acpi_pm. 
> 
> But if timer interrupt is delayed by more than acpi_pm wrap-around
> time, then the update_wall_time() is also screwed. Since it is not, we
> can rely on getrawmonotonic().

Right, if the box hangs for longer than the clocksource can count for,
the timekeeping subsystem will be off by some multiple of that length.
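
To illustrate the wrap arithmetic, here is a minimal userspace sketch,
not kernel code. It assumes the ACPI PM timer is a 24-bit counter
running at 3.579545 MHz (so it wraps roughly every 4.7 seconds). The
helper names are made up for illustration.

```c
#include <stdint.h>

/* Assumption: acpi_pm is a 24-bit counter at 3.579545 MHz. */
#define ACPI_PM_MASK 0xFFFFFFu
#define ACPI_PM_HZ   3579545u

/* Hypothetical helper: counter value after 'ticks' real ticks have
 * passed, starting from 'start'. The hardware only keeps 24 bits. */
uint32_t counter_after(uint32_t start, uint64_t ticks)
{
	return (uint32_t)((start + ticks) & ACPI_PM_MASK);
}

/* Hypothetical helper: elapsed ticks as seen by software that only
 * has two counter samples. Whole wrap periods are invisible here. */
uint32_t masked_delta(uint32_t now, uint32_t last)
{
	return (now - last) & ACPI_PM_MASK;
}
```

With these assumptions, a 10 second stall is 35795450 real ticks, but
masked_delta() over the two samples reports only 2241018 ticks (about
0.63 seconds). The two full wrap periods in between are lost.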

And that's exactly why I'm advising against using
gettimeofday/getrawmonotonic or any other software-managed sense of time
for the hangcheck timer, as you won't be able to correctly detect hangs.

I'm also suggesting that using something like read_persistent_clock() is
better, because there is no OS/software management involved (other than
the minor syncing issue I mentioned before), so if the system hangs for a
long period of time, then returns, you'll still be able to detect the
hang.
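
As a rough sketch of the check I have in mind (userspace C, not a
patch): take two readings, in seconds, from a clock that keeps counting
without OS help, and compare the difference to the expected interval.
The function name is hypothetical, and the tick and margin parameters
are assumed to mirror the hangcheck timer's tick and margin settings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: last_sec and now_sec are readings from a
 * persistent (hardware-maintained) clock, in seconds. The timer was
 * expected to fire every tick_sec seconds; margin_sec is the extra
 * delay tolerated before the interval counts as a hang. */
bool hang_detected(int64_t last_sec, int64_t now_sec,
		   int64_t tick_sec, int64_t margin_sec)
{
	return (now_sec - last_sec) > (tick_sec + margin_sec);
}
```

Because the readings do not depend on the timer interrupt having run,
a stall much longer than the clocksource wrap period still shows up as
a large difference.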

But maybe what folks are using the hangcheck timer for is shifting, so
it's possible that I'm not quite understanding what you're trying to do
here.

thanks
-john
