Date:   Fri, 9 Nov 2018 11:04:59 -0700
From:   Rian Quinn <rianquinn@...il.com>
To:     sean.j.christopherson@...el.com
Cc:     linux-kernel@...r.kernel.org
Subject: Re: x86_64 INIT/SIPI Bug

>> I apologize upfront if this is the wrong place to post this, pretty new to this.
>>
>> We are working on the Bareflank Hypervisor (www.bareflank.org), and we
>> are passing through the INIT/SIPI process (similar to how a VMX
>> rootkit from EFI might boot the OS) and we noticed that on Arch Linux,
>> the INIT/SIPI process stalls, something we are not seeing on Ubuntu.
>
> I'm confused, INIT is blocked post-VMXON, what are you passing through?

You are correct that INIT will trap unconditionally, but all we do is set the
activity state to wait-for-sipi and return, allowing Linux to continue
its boot process. The code can be seen here:
https://github.com/rianquinn/extended_apis/blob/hyperkernel_1/bfvmm/src/hve/arch/intel_x64/vmexit/init_signal.cpp

It should be noted that this works great for Linux and Windows, allowing us
to boot pretty much any OS that we want with the hypervisor running
(kind of like a VMX rootkit as no traps are really occurring except CPUID
once the OS is loaded). We are doing something very similar to Intel's KGT,
and Xen's PVH dom0.

The problem is that, as we started working on this, we noticed that Ubuntu was
booting fine but Arch wasn't, and it turns out that Arch must be compiling the
kernel with this optimization enabled. Once it is enabled, the kernel basically
sends two SIPI commands before the AP has a chance to trap INIT, set the
activity state, and then reenter, which causes both SIPIs to get dropped by
hardware. In other words, since Linux is not waiting before sending the first
SIPI as the manual states, the SIPIs are lost if a hypervisor is enabled, even
if the hypervisor is executing the least possible amount of code (just setting
the activity state and returning). The working solution is either to use a
Linux distribution that disables this optimization, like Ubuntu, or to provide
the Linux kernel with the boot param that tells it to add the delay.
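
To make the timing argument concrete, here is a rough toy model of the race
(a sketch only, not the kernel's or Bareflank's code). The class and function
names, and all of the microsecond values, are hypothetical and chosen purely
for illustration. The model assumes the AP only accepts a SIPI once the
hypervisor has finished handling the INIT exit and put it in wait-for-sipi.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyAP:
    """Toy AP under a hypervisor (hypothetical model, not real hardware)."""
    vmexit_latency_us: int              # assumed time to trap INIT, set state, reenter
    ready_at_us: Optional[int] = None   # when the AP reaches wait-for-sipi
    started: bool = False

    def receive_init(self, now_us: int) -> None:
        # INIT traps; the AP is in wait-for-sipi only after the handler returns.
        self.started = False
        self.ready_at_us = now_us + self.vmexit_latency_us

    def receive_sipi(self, now_us: int) -> bool:
        # A SIPI that arrives before wait-for-sipi is reached is dropped.
        if self.ready_at_us is not None and now_us >= self.ready_at_us:
            self.started = True
            return True
        return False


def boot_ap(vmexit_latency_us: int, init_delay_us: int, sipi_gap_us: int) -> bool:
    """Send INIT at t=0, then two SIPIs; report whether the AP started."""
    ap = ToyAP(vmexit_latency_us)
    ap.receive_init(0)
    ap.receive_sipi(init_delay_us)
    ap.receive_sipi(init_delay_us + sipi_gap_us)
    return ap.started


if __name__ == "__main__":
    # No delay after INIT, hypervisor in the path: both SIPIs are dropped.
    print(boot_ap(vmexit_latency_us=50, init_delay_us=0, sipi_gap_us=10))
    # Same hypervisor, but the sender waits after INIT: the AP starts.
    print(boot_ap(vmexit_latency_us=50, init_delay_us=10000, sipi_gap_us=10))
```

In this model the only thing that matters is whether at least one SIPI lands
after the assumed exit-handling latency has elapsed, which is why the delay
after INIT (or a long enough gap between the SIPIs) makes the difference.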

The root of the issue is that the quirk assumes the CPU can handle
INIT/SIPI/SIPI without a delay, but this assumption doesn't hold if the INIT
first has to trap to a hypervisor (regardless of the hypervisor). In this case,
a delay is still needed.

On Fri, Nov 9, 2018 at 10:49 AM Sean Christopherson
<sean.j.christopherson@...el.com> wrote:
>
> On Thu, Nov 08, 2018 at 03:23:59PM -0700, Rian Quinn wrote:
> > I apologize upfront if this is the wrong place to post this, pretty new to this.
> >
> > We are working on the Bareflank Hypervisor (www.bareflank.org), and we
> > are passing through the INIT/SIPI process (similar to how a VMX
> > rootkit from EFI might boot the OS) and we noticed that on Arch Linux,
> > the INIT/SIPI process stalls, something we are not seeing on Ubuntu.
>
> I'm confused, INIT is blocked post-VMXON, what are you passing through?
