linux-kernel - Re: kvm causing memory corruption? now 2.6.26-rc4

Message-ID: <48469BDA.3050206@qumranet.com>
Date:	Wed, 04 Jun 2008 16:42:50 +0300
From:	Avi Kivity <avi@...ranet.com>
To:	Dave Hansen <dave@...ux.vnet.ibm.com>
CC:	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	kvm-devel <kvm@...r.kernel.org>,
	"Anthony N. Liguori [imap]" <aliguori@...ibm.com>
Subject: Re: kvm causing memory corruption?  now 2.6.26-rc4

Dave Hansen wrote:
> On Thu, 2008-03-27 at 16:59 +0200, Avi Kivity wrote:
>   
>> Dave Hansen wrote:
>>     
>>> On Thu, 2008-03-27 at 12:10 +0200, Avi Kivity wrote:
>>>       
>>>> btw, is this with >= 4GB RAM on the host?
>>>>         
>>> Well, are you asking whether I have PAE on or not? :)  
>>>       
>> No, I'm asking whether there is a possibility of address truncation :)
>>
>> PAE by itself doesn't affect kvm much, as it always runs the guest in 
>> pae mode.
>>
>> Can you try running with mem=2000M or something?
>>     
>
> I have a few more data points on this.  Sorry for the massive delay from
> the last report -- I'm being a crappy bug reporter.  But, this is on my
> one and only laptop which makes it a serious pain to diagnose.  I also
> didn't have a hardware serial console on it before, which I do now.
> This is all on 2.6.26-rc4-01549-g1beee8d.
>
> Adding the mem= does not help at all.  But, it is all a bit more
> diagnosable now than a month or two ago.  I turned on all of the kernel
> debugging that I could get my grubby little hands on.  It now oopses
> quite consistently when kvm runs instead of after.  Here's a collection
> of oopses that I captured after setting up a serial line:
>
> 	http://sr71.net/~dave/kvm-oops1.txt
>
> After collecting all those, I turned on CONFIG_DEBUG_HIGHMEM and the
> oopses miraculously stopped.  But, the guest hung (for at least 5
> minutes or so) during windows bootup, pegging my host CPU.  Most of the
> CPU was going to klogd, so I checked dmesg.
>
>   

Can you check with mem=900M (and CONFIG_DEBUG_HIGHMEM=n)?  If the oopses
go away, that will confirm that the problems are highmem related, and
not physical address truncation related.
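
To illustrate the reasoning behind the different mem= values, here is a rough sketch.  It assumes a 32-bit x86 host with the default 3G/1G split, where roughly the first 896 MiB of RAM is lowmem and everything above it is highmem; the exact boundary depends on the kernel configuration.  The function name and constants are made up for this example.

```python
# Assumption: default 3G/1G split on 32-bit x86, about 896 MiB of lowmem.
LOWMEM_LIMIT_MIB = 896
FOUR_GIB_MIB = 4096


def classify_mem(mem_mib):
    """Return (MiB of highmem, whether any RAM lies above 4 GiB)."""
    highmem_mib = max(0, mem_mib - LOWMEM_LIMIT_MIB)
    above_4g = mem_mib > FOUR_GIB_MIB
    return highmem_mib, above_4g


for mem in (2000, 900):
    highmem, above_4g = classify_mem(mem)
    print(f"mem={mem}M: {highmem} MiB highmem, RAM above 4 GiB: {above_4g}")
```

Under these assumptions, mem=2000M keeps all RAM below 4 GiB but still leaves over a gigabyte of highmem, while a limit near 900M leaves almost none, so the two settings test truncation and highmem separately.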

> I was seeing messages like this
>
> [  428.918108] kvm_handle_exit: unexpected, valid vectoring info and exit reason is 0x9
>
> And quite a few of them, like 100,000/sec.  That's why klogd was pegging
> the CPU.  Any idea on a next debugging step?
>
>   

Exit reason 0x9 is a task switch.  Newer kvm versions handle those.
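
For reference, a small sketch of how the numbers in that log message can be read.  The table holds only a handful of basic VMX exit reasons, and the valid bit is bit 31 of the IDT-vectoring information field; describe_exit is a hypothetical helper for illustration, not kvm code, and the sample vectoring value is invented.

```python
# Bit 31 of the VMCS IDT-vectoring information field marks it as valid,
# meaning the exit happened while an event was being delivered.
VECTORING_INFO_VALID_MASK = 0x80000000

# A few basic VMX exit reasons (low 16 bits of the exit reason field).
EXIT_REASONS = {
    0x0: "exception or NMI",
    0x1: "external interrupt",
    0x2: "triple fault",
    0x9: "task switch",
    0xA: "CPUID",
    0xC: "HLT",
}


def describe_exit(exit_reason, vectoring_info=0):
    """Return (name of the basic exit reason, vectoring info valid?)."""
    basic = exit_reason & 0xFFFF
    name = EXIT_REASONS.get(basic, "not in this table")
    valid = bool(vectoring_info & VECTORING_INFO_VALID_MASK)
    return name, valid


# Invented vectoring info value with the valid bit set.
print(describe_exit(0x9, 0x80000008))
```

An exit with reason 0x9 and valid vectoring info is what a kvm without task switch support reports as unexpected, which would explain the flood of messages.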


-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.