Message-ID: <534A8101.1050803@intel.com>
Date:	Sun, 13 Apr 2014 20:20:17 +0800
From:	Jet Chen <jet.chen@...el.com>
To:	Borislav Petkov <bp@...en8.de>
CC:	"H. Peter Anvin" <hpa@...or.com>,
	"Romer, Benjamin M" <Benjamin.Romer@...sys.com>,
	Fengguang Wu <fengguang.wu@...el.com>,
	Paolo Bonzini <pbonzini@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>, qemu-devel@...gnu.org
Subject: Re: [visorchipset] invalid opcode: 0000 [#1] PREEMPT SMP

Thanks Borislav.
I have not tested this issue on the latest version of QEMU, so the QEMU folks may want to reproduce it on their side.

All the details needed to reproduce it can be found in this mail thread, but I would like to give a summary here.

- kernel code base:
	
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

commit 12e364b9f08aa335dc7716ce74113e834c993765
Author:     Ken Cox <jkc@...hat.com>
AuthorDate: Tue Mar 4 07:58:07 2014 -0600
Commit:     Greg Kroah-Hartman <gregkh@...uxfoundation.org>
CommitDate: Tue Mar 4 16:58:21 2014 -0800

    staging: visorchipset driver to provide registration and other services


- reproduce script (the original script in the first message of this thread does not reproduce the problem; the -enable-kvm option has to be removed):
------------------------------------------------------------
#!/bin/bash

kernel=$1

kvm=(
	qemu-system-x86_64 -cpu kvm64
	-kernel "$kernel"
	-smp 2
	-m 256M
	-net nic,vlan=0,macaddr=00:00:00:00:00:00,model=virtio
	-net user,vlan=0
	-net nic,vlan=1,model=e1000
	-net user,vlan=1
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-serial stdio
	-display none
	-monitor null
)

append=(
	debug
	sched_debug
	apic=debug
	ignore_loglevel
	sysrq_always_enabled
	panic=10
	prompt_ramdisk=0
	earlyprintk=ttyS0,115200
	console=ttyS0,115200
	console=tty0
	vga=normal
	root=/dev/ram0
	rw
)

"${kvm[@]}" --append "${append[*]}"
------------------------------------------------------------------
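
Because the outcome depends on whether -enable-kvm is passed, it can help to print the command line for each mode before running anything. The following is only a minimal sketch and not part of the original script: the function name build_qemu_cmd, the mode names tcg/kvm and the default image name bzImage are made up for illustration, and only a subset of the options above is included.

```bash
#!/bin/sh
# Sketch only: prints the QEMU command line for one mode instead of running it.
# build_qemu_cmd and the tcg/kvm mode names are hypothetical helpers.

build_qemu_cmd() {
	# $1 = kernel image, $2 = mode (tcg or kvm)
	cmd="qemu-system-x86_64 -cpu kvm64 -kernel $1 -smp 2 -m 256M"

	# Without -enable-kvm, QEMU runs the guest under its own emulation
	# (TCG); with it, the guest runs under KVM.
	if [ "$2" = kvm ]; then
		cmd="$cmd -enable-kvm"
	fi

	printf '%s\n' "$cmd"
}

# Print both variants; bzImage is a placeholder default image name.
for mode in tcg kvm; do
	build_qemu_cmd "${1:-bzImage}" "$mode"
done
```

The tcg variant corresponds to the script above; the kvm variant is the one that does not reproduce the problem.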

- dmesg log:

[   24.135101] FPGA image file name: xlinx_fpga_firmware.bit
[   24.137595] GPIO INIT FAIL!!
[   24.141283] driver version 1.0.0.0 loaded
[   24.142539] chipset driver version 1.0.0.0 loadedinvalid opcode: 0000 [#1] PREEMPT SMP
[   24.144793] Modules linked in:
[   24.145303] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0-rc5-00621-g12e364b #1
[   24.145303] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[   24.145303] task: ffff88001157a010 ti: ffff88001157c000 task.ti: ffff88001157c000
[   24.145303] RIP: 0010:[<ffffffff81e37115>]  [<ffffffff81e37115>] visorchipset_init+0x7b/0x8c5
[   24.145303] RSP: 0000:ffff88001157de58  EFLAGS: 00000286
[   24.145303] RAX: 000000000000070b RBX: 0000000000000004 RCX: 4000000000000000
[   24.145303] RDX: a70aba7500000000 RSI: ffff88001157de5c RDI: ffff88001157de58
[   24.145303] RBP: ffff88001157de90 R08: 0000000000000002 R09: ffff88001157de60
[   24.145303] R10: ffff88001157de64 R11: 0000000000000000 R12: ffff88001157de5c
[   24.145303] R13: ffff88001157de60 R14: ffff88001157de64 R15: 0000000000000000
[   24.145303] FS:  0000000000000000(0000) GS:ffff880012600000(0000) knlGS:0000000000000000
[   24.145303] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[   24.145303] CR2: ffff880002992000 CR3: 0000000001c07000 CR4: 00000000003006f0
[   24.145303] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   24.145303] DR3: 0000000000000000 DR6: 0000000000000000 DR7: 0000000000000000
[   24.145303] Stack:
[   24.145303]  00000800000306c1 078bfbf982d82203 ffffffff81e3709a 0000000000000000
[   24.145303]  00000000000001df 0000000000000000 0000000000000000 ffff88001157df08
[   24.145303]  ffffffff810002b2 ffffffff810b2600 ffff88001157df08 ffffffff810b27db
[   24.145303] Call Trace:
[   24.145303]  [<ffffffff81e3709a>] ? visorchannel_init+0x1d/0x1d
[   24.145303]  [<ffffffff810002b2>] do_one_initcall+0x8e/0x138
[   24.145303]  [<ffffffff810b2600>] ? param_array_set+0xef/0xf5
[   24.145303]  [<ffffffff810b27db>] ? parse_args+0x180/0x248
[   24.145303]  [<ffffffff81dfbf86>] kernel_init_freeable+0x108/0x199
[   24.145303]  [<ffffffff81dfb73a>] ? do_early_param+0x8a/0x8a
[   24.145303]  [<ffffffff8173f08e>] ? rest_init+0xc2/0xc2
[   24.145303]  [<ffffffff8173f097>] kernel_init+0x9/0xda
[   24.145303]  [<ffffffff8176024c>] ret_from_fork+0x7c/0xb0
[   24.145303]  [<ffffffff8173f08e>] ? rest_init+0xc2/0xc2
[   24.145303] Code: 8d 65 cc 4c 8d 6d d0 4c 8d 75 d4 79 21 48 ba 00 00 00 00 75 ba 0a a7 48 b9 00 00 00 00 00 00 00 40 bb 04 00 00 00 b8 0b 07 00 00 <0f> 01 c1 8b 35 c2 c4 b4 00 48 c7 c7 f5 93 b4 81 31 c0 e8 3b 21
[   24.145303] RIP  [<ffffffff81e37115>] visorchipset_init+0x7b/0x8c5
[   24.145303]  RSP <ffff88001157de58>
[   24.187247] ---[ end trace 62b5721899a66a6c ]---
[   24.188157] Kernel panic - not syncing: Fatal exception
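
In the Code: line, the byte in angle brackets marks the instruction at RIP. Here the bytes are 0f 01 c1, which is the encoding of VMCALL, so the invalid opcode is raised on the VMCALL itself. A small sketch for pulling those bytes out of a saved log follows; the function name faulting_bytes is hypothetical, and it assumes the Code: line is not wrapped. For a full disassembly, scripts/decodecode in the kernel tree is the usual tool.

```bash
#!/bin/sh
# Sketch only: reads oops text on stdin and prints the byte marked <..>
# in the Code: line together with the two bytes that follow it.
# faulting_bytes is a hypothetical helper name.

faulting_bytes() {
	sed -n 's/.*<\([0-9a-f][0-9a-f]\)> \([0-9a-f][0-9a-f]\) \([0-9a-f][0-9a-f]\).*/\1 \2 \3/p'
}

# Example with a shortened copy of the Code: line from the log above.
printf '%s\n' 'Code: bb 04 00 00 00 b8 0b 07 00 00 <0f> 01 c1 8b 35 c2' | faulting_bytes
# prints: 0f 01 c1
```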


For the kernel kconfig and the full dmesg log, please see the attachments to this mail.

Thanks,
Jet


On 04/13/2014 07:51 PM, Borislav Petkov wrote:
> Should we perhaps CC qemu-devel here for an opinion.
> 
> Guys, this mail should explain the issue but in case there are
> questions, the whole thread starts here:
> 
> http://lkml.kernel.org/r/20140407111725.GC25152@localhost
> 
> Thanks.
> 
> On Sat, Apr 12, 2014 at 01:35:49AM +0800, Jet Chen wrote:
>> On 04/12/2014 12:33 AM, H. Peter Anvin wrote:
>>> On 04/11/2014 06:51 AM, Romer, Benjamin M wrote:
>>>>
>>>>> I'm still confused where KVM comes into the picture.  Are you actually
>>>>> using KVM (and thus talking about nested virtualization) or are you
>>>>> using Qemu in JIT mode and running another hypervisor underneath?
>>>>
>>>> The test that Fengguang used to find the problem was running the linux
>>>> kernel directly using KVM. When the kernel was run with "-cpu Haswell,
>>>> +smep,+smap" set, the vmcall failed with invalid op, but when the kernel
>>>> is run with "-cpu qemu64", the vmcall causes a vmexit, as it should.
>>>
>>> As far as I know, Fengguang's test doesn't use KVM at all, it runs Qemu
>>> as a JIT.  Completely different thing.  In that case Qemu probably
>>> should *not* set the hypervisor bit.  However, the only thing that the
>>> hypervisor bit means is that you can look for specific hypervisor APIs
>>> in CPUID level 0x40000000+.
>>>
>>>> My point is, the vmcall was made because the hypervisor bit was set. If
>>>> this bit had been turned off, as it would be on a real processor, the
>>>> vmcall wouldn't have happened.
>>>
>>> And my point is that that is a bug.  In the driver.  A very serious one.
>>>  You cannot call VMCALL until you know *which* hypervisor API(s) you
>>> have available, period.
>>>
>>>>> The hypervisor bit is a complete red herring. If the guest CPU is
>>>>> running in VT-x mode, then VMCALL should VMEXIT inside the guest
>>>>> (invoking the guest root VT-x), 
>>>>
>>>> The CPU is running in VT-X. That was my point, the kernel is running in
>>>> the KVM guest, and KVM is setting the CPU feature bits such that bit 31
>>>> is enabled.
>>>
>>> Which it is because it wants to export the KVM hypercall interface.
>>> However, keying VMCALL *only* on the HYPERVISOR bit is wrong in the extreme.
>>>
>>>> I don't think it's a red herring because the kernel uses this bit
>>>> elsewhere - it is reported as X86_FEATURE_HYPERVISOR in the CPU
>>>> features, and can be checked with the cpu_has_hypervisor macro (which
>>>> was not used by the original author of the code in the driver, but
>>>> should have been). VMWare and KVM support in the kernel also check for
>>>> this bit before checking their hypervisor leaves for an ID. If it's not
>>>> properly set it affects more than just the s-Par drivers.
>>>>
>>>>> but the fact still remains that you
>>>>> should never, ever, invoke VMCALL unless you know what hypervisor you
>>>>> have underneath.
>>>>
>>>> From the standpoint of the s-Par drivers, yes, I agree (as I already
>>>> said). However, VMCALL is not a privileged instruction, so anyone could
>>>> use it from user space and go right past the OS straight to the
>>>> hypervisor. IMHO, making it *lethal* to the guest is a bad idea, since
>>>> any user could hard-stop the guest with a couple of lines of C.
>>>
>>> Typically the hypervisor wants to generate a #UD inside of the guest for
>>> that case.  The guest OS will intercept it and SIGILL the user space
>>> process.
>>>
>>> 	-hpa
>>>
>>
>> Hi Ben,
>>
>> I re-tested this case with/without option -enable-kvm.
>>
>> qemu-system-x86_64 -cpu Haswell,+smep,+smap			invalid op
>> qemu-system-x86_64 -cpu kvm64					invalid op
>> qemu-system-x86_64 -cpu Haswell,+smep,+smap -enable-kvm 	everything OK
>> qemu-system-x86_64 -cpu kvm64 -enable-kvm			everything OK
>>
>> I think this is probably a bug in QEMU.
>> Sorry for misleading you. I am not experienced with QEMU. I did not realize I needed to try this case with different options until I read Peter's reply.
>>
>> As Peter said, QEMU probably should *not* set the hypervisor bit. But based on my testing, I think KVM works properly in this case.
>>
>> Thanks,
>> Jet
>>
> 
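
For reference, the hypervisor bit discussed in the quoted thread (CPUID leaf 1, ECX bit 31, reported by the kernel as X86_FEATURE_HYPERVISOR) shows up inside a Linux guest as the "hypervisor" flag in /proc/cpuinfo. Below is a minimal sketch for checking it from the guest; the function name has_hypervisor_flag is hypothetical. As Peter points out, the bit only means that hypervisor leaves may be looked for at CPUID 0x40000000+. It does not say which hypervisor is present, so this check alone does not make VMCALL safe.

```bash
#!/bin/sh
# Sketch only: reports whether the "hypervisor" CPU flag is listed.
# has_hypervisor_flag is a hypothetical helper name.

has_hypervisor_flag() {
	# $1 = file in /proc/cpuinfo format (defaults to /proc/cpuinfo)
	grep -Eq '^flags[[:space:]]*:.*[[:space:]]hypervisor([[:space:]]|$)' \
		"${1:-/proc/cpuinfo}" 2>/dev/null
}

if has_hypervisor_flag; then
	echo "hypervisor bit set"
else
	echo "hypervisor bit clear"
fi
```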

View attachment "dmesg-quantal-f4-128:20140407182830:x86_64-randconfig-br0-04050702:3.14.0-rc5-00621-g12e364b:1" of type "text/plain" (61275 bytes)

View attachment "config-3.14.0-rc5-00621-g12e364b" of type "text/plain" (99028 bytes)
