[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8736sbkq2w.fsf@xmission.com>
Date: Thu, 08 Nov 2018 19:16:39 -0600
From: ebiederm@...ssion.com (Eric W. Biederman)
To: Rian Quinn <rianquinn@...il.com>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org
Subject: Re: x86_64 INIT/SIPI Bug
Rian Quinn <rianquinn@...il.com> writes:
> I apologize upfront if this is the wrong place to post this, pretty new to this.
>
> We are working on the Bareflank Hypervisor (www.bareflank.org), and we
> are passing through the INIT/SIPI process (similar to how a VMX
> rootkit from EFI might boot the OS) and we noticed that on Arch Linux,
> the INIT/SIPI process stalls, something we are not seeing on Ubuntu.
>
> Turns out, to fix the issue, we had to turn on cpu_init_udelay=10000.
> The problem is, once a hypervisor is turned on, even one that is doing
> nothing but passing through the instructions, the "quick" that is
> detailed below fails as the kernel is not giving the CPU enough time
> to perform a VMExit/VMEntry (even through the VMExit is not doing
> anything).
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/kernel/smpboot.c?h=v4.20-rc1#n650
>
> You can see our INIT/SIPI code here if you are interested:
> https://github.com/rianquinn/extended_apis/blob/hyperkernel_1/bfvmm/src/hve/arch/intel_x64/vmexit/init_signal.cpp
>
> The reason I suggest this is a bug is the manual clearly states that a
> wait is required and the "quirk" that turns off this delay prevents
> code like this from working. Would it be possible to either:
> - Turn this off by default, but still allow someone to turn it on if
> they are confident the delay is not needed?
> - Provide a generic way to turn this off (maybe if a hypervisor is
> detected, it defaults to off)?
>
> I'd be more than happy to provide a patch and test, but I'm not sure
> if there is any interest in changing this code.
I would suggest testing either for your hypervisor or simply for being
inside some hypervisor. As I read the code it is only turning off the
10ms delay on processors where it is know that it is safe to do so.
This is a case where it is not safe to disable the 10ms delay,
so it makes sense not not turn off the delay.
Eric
Powered by blists - more mailing lists