Date:	Mon, 23 Mar 2009 19:04:29 +0300
From:	Michael Tokarev <mjt@....msk.ru>
To:	Ingo Molnar <mingo@...e.hu>
CC:	Avi Kivity <avi@...hat.com>, John Stultz <johnstul@...ibm.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Linux-kernel <linux-kernel@...r.kernel.org>,
	KVM list <kvm@...r.kernel.org>
Subject: Re: phenom, amd780g, tsc, hpet, kvm, kernel -- who's at fault?

Ingo Molnar wrote:
> * Michael Tokarev <mjt@....msk.ru> wrote:
> 
>> Now, after quite some googling around, I tried to disable hpet, 
>> booting with hpet=disable parameter.  And that one fixed all the 
>> problems at once. 7 days uptime, I stress-tested it several times, 
>> it works with TSC as timesource (still a problem within guests as 
>> those shows unstable TSC anyway) since boot, no issues logged.  
>> Even cpufreq works as expected...
[]
> It could again go bad like it did before - those messages are signs 
> of HPET weirdnesses.
> 
> Probably your box's hpet needs to be blacklisted, so that it gets 
> disabled automatically on bootup.

Well, I'm not convinced at all... at least not yet ;)

The reason is simple: this box was rock solid a few months back,
with 2.6.25 and 2.6.26 kernels I think.  It had problems with kvm
(bugs) and lacked hardware support in general (both the chipset
and the phenom cpu were still too new to be fully supported).  At
that time I installed the thing (it was a test install with a
random hdd, so I added real drives and installed a real distro),
with quite a lot of data copying back and forth (I was rearranging
partitions, raid arrays, guests and so on, copying data to another
disk, to another machine and back).  There was not a single issue,
not a single mention of tsc or hpet instabilities, and system time
was stable too.  But at some point -- unfortunately I don't know
when exactly, and it would certainly be very interesting to know,
I'll try to figure it out -- it first started showing system clock
weirdness, and finally came to this Friday the 13th incident.

All that to say: it was stable with the earlier kernels.  Now it's
not.  Maybe, just maybe, at that time hpet wasn't supported, or
maybe wasn't used, or wasn't supported fully enough to rely on -
I've no idea.  If that's the case, I'll just shut up now because
the whole point becomes moot.
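
One way to settle the "was hpet even used back then" question would
be to boot an old kernel and look at what the clocksource code
reports.  A minimal sketch, assuming the kernel exposes the usual
clocksource files in sysfs (the helper name read_clocksource and the
base parameter are made up for illustration):

```python
from pathlib import Path

# Usual sysfs location of the clocksource files on Linux. Whether a
# given old kernel exposes it is an assumption, hence the fallback.
SYSFS_CLOCKSOURCE = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=SYSFS_CLOCKSOURCE):
    """Return (current, available) clocksources, or (None, []) if unreadable."""
    base = Path(base)
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        # Files missing or not readable: report "unknown" rather than fail.
        return None, []
    return current, available


if __name__ == "__main__":
    current, available = read_clocksource()
    print("current:  ", current)
    print("available:", " ".join(available))
```

If hpet is missing from the available list on the old kernel, the
comparison with the current one does not say much.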

Maybe it was due to a somehow broken bios -- I did several bios
updates there, mostly because linux complained about something
scary (something akin to "wasting so many megs of memory due to
bios not setting up something (GART? IOMMU?)") and I was hoping to
fix that.  And it will be fixed someday in the bios...

(By the way: how bad is the lack of hpet?  It's used for
something, and having it malfunctioning and disabled does
not sound good, esp. on a machine which is running close
to its maximum...  Maybe I should return the mobo? :)

Thanks!

/mjt