Message-ID: <20121017064319.GA17754@localhost>
Date:	Wed, 17 Oct 2012 14:43:19 +0800
From:	Fengguang Wu <fengguang.wu@...el.com>
To:	Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>
Cc:	Avi Kivity <avi@...hat.com>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	LKML <linux-kernel@...r.kernel.org>
Subject: Re: [3.5.0 BUG] vmx_handle_exit: unexpected, valid vectoring info
 (0x80000b0e)

On Wed, Oct 17, 2012 at 02:26:22PM +0800, Xiao Guangrong wrote:
> On 09/14/2012 01:57 PM, Xiao Guangrong wrote:
> > On 09/12/2012 04:15 PM, Avi Kivity wrote:
> >> On 09/12/2012 07:40 AM, Fengguang Wu wrote:
> >>> Hi,
> >>>
> >>> 3 of my test boxes running the v3.5 kernel became inaccessible and I
> >>> find two of them kept emitting this dmesg:
> >>>
> >>> vmx_handle_exit: unexpected, valid vectoring info (0x80000b0e) and exit reason is 0x31
> >>>
> >>> The other one has frozen and the above lines are the last dmesg.
> >>> Any ideas?
> >>
> >> First, that printk should be rate-limited.
> >>
> >> Second, we should add EXIT_REASON_EPT_MISCONFIG (0x31) to 
> >>
> >> 	if ((vectoring_info & VECTORING_INFO_VALID_MASK) &&
> >> 			(exit_reason != EXIT_REASON_EXCEPTION_NMI &&
> >> 			exit_reason != EXIT_REASON_EPT_VIOLATION &&
> >> 			exit_reason != EXIT_REASON_TASK_SWITCH))
> >> 		printk(KERN_WARNING "%s: unexpected, valid vectoring info "
> >> 		       "(0x%x) and exit reason is 0x%x\n",
> >> 		       __func__, vectoring_info, exit_reason);
> >>
> >> since it's easily caused by the guest.
> > 
> > Yes, I will do these.
> > 
> >>
> >> Third, it's really unexpected.  It seems the guest was attempting to
> >> deliver a page fault exception (0x0e) but encountered an mmio page
> >> during delivery (in the IDT, TSS, stack, or page tables).  Is this
> >> reproducible?  If so it's easy to patch kvm to halt in that case and
> >> allow examining the guest via qemu.
> >>
> > 
> > I have no idea yet why the box froze in this case. I will try to write
> > a test case; hopefully it will help me find the reason.
> > 
> 
> I still do not know why the Linux kernel triggered it. I have posted
> a patchset to report an internal error for this case, hoping
> Fengguang can reproduce it with the patchset applied so that QEMU's
> dump can help us find the reason.
> 
> I will keep working on it.

Thanks! Should I run a patched kernel, or just plain 3.6.0?

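For reference, the two numbers in the quoted warning can be decoded with
a few lines of C. This is only a minimal user-space sketch, not kernel
code and not the posted patchset: the mask and exit-reason values are
copied by hand from what I understand the kernel's vmx.h definitions to
be, and the helper names (vectoring_vector, vectoring_type,
vectoring_has_error_code, vectoring_is_unexpected) are made up for
illustration. The last helper mirrors the quoted condition with
EXIT_REASON_EPT_MISCONFIG added, as suggested above.

```c
#include <stdbool.h>
#include <stdint.h>

/* IDT-vectoring information field layout (values assumed to match the
 * kernel's VECTORING_INFO_* masks; copied here so this builds in user
 * space). */
#define VECTORING_INFO_VECTOR_MASK        0x000000ffu /* bits 7:0   */
#define VECTORING_INFO_TYPE_MASK          0x00000700u /* bits 10:8  */
#define VECTORING_INFO_DELIVER_CODE_MASK  0x00000800u /* bit 11     */
#define VECTORING_INFO_VALID_MASK         0x80000000u /* bit 31     */

/* Basic VM-exit reasons referenced in the thread. */
#define EXIT_REASON_EXCEPTION_NMI   0
#define EXIT_REASON_TASK_SWITCH     9
#define EXIT_REASON_EPT_VIOLATION   48
#define EXIT_REASON_EPT_MISCONFIG   49 /* 0x31, the reason in the dmesg */

/* Vector of the event that was being delivered (0x0e is #PF). */
unsigned vectoring_vector(uint32_t info)
{
	return info & VECTORING_INFO_VECTOR_MASK;
}

/* Event type; 3 means hardware exception. */
unsigned vectoring_type(uint32_t info)
{
	return (info & VECTORING_INFO_TYPE_MASK) >> 8;
}

/* True if the event carries an error code. */
bool vectoring_has_error_code(uint32_t info)
{
	return (info & VECTORING_INFO_DELIVER_CODE_MASK) != 0;
}

/* The quoted check, with EPT_MISCONFIG added to the allowed reasons. */
bool vectoring_is_unexpected(uint32_t info, uint32_t exit_reason)
{
	return (info & VECTORING_INFO_VALID_MASK) &&
	       exit_reason != EXIT_REASON_EXCEPTION_NMI &&
	       exit_reason != EXIT_REASON_EPT_VIOLATION &&
	       exit_reason != EXIT_REASON_EPT_MISCONFIG &&
	       exit_reason != EXIT_REASON_TASK_SWITCH;
}
```

With these definitions, 0x80000b0e decodes as valid, type 3 (hardware
exception), error code present, vector 0x0e, which matches the reading
that a page fault was being delivered. Calling
vectoring_is_unexpected(0x80000b0e, 0x31) returns false, so the warning
would no longer fire for this case.
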
Another problem I sometimes run into is that dmesg stops working on the
test boxes that run lots of KVM guests. It aborts with an error message:

dmesg: klogctl failed: Bad address

Thanks,
Fengguang