Message-ID: <46392756.2030309@vmware.com>
Date:	Wed, 02 May 2007 17:05:42 -0700
From:	Zachary Amsden <zach@...are.com>
To:	Chuck Ebbert <cebbert@...hat.com>
CC:	Marcos Pinto <markybob@...il.com>, Andi Kleen <ak@...e.de>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Alessandro Zummo <a.zummo@...ertech.it>
Subject: Re: Mysterious RTC hangs on x86_64 - fixed, sort of

Chuck Ebbert wrote:

Well, turns out this is a heisenbug.  Which is good, since it means the 
nop patch didn't change anything.

> Try leaving the spinlocks and just disabling the callbacks. And maybe
> enable spinlock debugging...
>   

I tried removing all the spinlocks inside the interrupt handler.  It seemed 
to work fine for a while, but still hung (at worst, it looks like missing 
locks mean we might screw up and read / write the wrong CMOS register, 
not hang or crash).

So I took down the 2nd CPU with hotplug (did not yet try a UP kernel though).  
It took a longer time, but still hung.  Seems not to be a spinlock 
problem, but I'll turn on debugging anyway.

>   
>> CONFIG_HPET_EMULATE_RTC=y
>>     
>
> Did you try without that?
>   

Will do.  That looks much more suspicious.  I thought I had killed it 
already, but had only got this:

# CONFIG_HPET_RTC_IRQ is not set

If that still crashes, I'll try running CMOS accesses in a loop in 
userspace to see if maybe the port I/O is tickling a chipset bug (the 
only other report I know of is on the same chipset, nVidia MCP51).  Maybe 
the SMM handler is accessing CMOS or something whacked out.  <laughs 
hysterically... stops laughing when the theory actually sounds 
plausible>.  Being stuck in SMM is not good for CPU thermal throttling 
... hopefully Turions don't reach nuclear emission point.

That would also maybe explain why the NMI watchdog doesn't seem to notice 
anything wrong.
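
For reference, the userspace CMOS loop mentioned above could look roughly 
like the sketch below.  This is only an illustration, not the test that 
was actually run.  It assumes /dev/port is available (CONFIG_DEVPORT) and 
that the process runs as root on x86; the names cmos_read, hammer_cmos and 
main are made up for this example.  Note that userspace cannot take the 
kernel's rtc_lock, so this races with the kernel's own RTC accesses.

```python
# Sketch only: hammer the CMOS/RTC index and data ports from userspace.
# Assumptions (not from the original mail): /dev/port exists and we are root.

CMOS_INDEX_PORT = 0x70  # write the register number here
CMOS_DATA_PORT = 0x71   # then read (or write) the register contents here


def cmos_read(port_dev, reg):
    """Read one CMOS register through a /dev/port-like file object."""
    port_dev.seek(CMOS_INDEX_PORT)
    # Bit 7 of port 0x70 is the NMI-disable bit; keep it clear.
    port_dev.write(bytes([reg & 0x7F]))
    port_dev.seek(CMOS_DATA_PORT)
    return port_dev.read(1)[0]


def hammer_cmos(port_dev, iterations, regs=range(0x00, 0x0A)):
    """Read the given registers repeatedly; return how many reads were done.

    The default range covers only the time/date registers (0x00-0x09).
    Status register C (0x0C) is left out because reading it clears the
    pending RTC interrupt flags.
    """
    reads = 0
    for _ in range(iterations):
        for reg in regs:
            cmos_read(port_dev, reg)
            reads += 1
    return reads


def main():
    # Unbuffered, so each write goes to the port immediately.
    with open("/dev/port", "r+b", buffering=0) as dev:
        total = 0
        while True:
            total += hammer_cmos(dev, 1000)
            print(f"{total} CMOS reads completed", flush=True)

# Call main() as root to start the loop; stop it with Ctrl-C.
```

If the box hangs with only this loop running and the RTC driver idle, 
that would point at the port I/O itself rather than at the driver.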


Thanks,
Zach
