Message-ID: <f858f28f-6d22-b53b-9572-5220b9ecf81c@redhat.com>
Date: Thu, 18 Apr 2019 18:47:56 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Sean Christopherson <sean.j.christopherson@...el.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>
Cc: kvm@...r.kernel.org,
Radim Krčmář <rkrcmar@...hat.com>,
Roman Kagan <rkagan@...tuozzo.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] x86: kvm: hyper-v: deal with buggy TLB flush requests from WS2012
On 18/04/19 16:17, Sean Christopherson wrote:
> On Wed, Mar 20, 2019 at 06:43:20PM +0100, Vitaly Kuznetsov wrote:
>> It was reported that with some special Multi Processor Group configuration,
>> e.g:
>> bcdedit.exe /set groupsize 1
>> bcdedit.exe /set maxgroup on
>> bcdedit.exe /set groupaware on
>> for a 16-vCPU guest WS2012 shows BSOD on boot when PV TLB flush mechanism
>> is in use.
>>
>> Tracing kvm_hv_flush_tlb immediately reveals the issue:
>>
>> kvm_hv_flush_tlb: processor_mask 0x0 address_space 0x0 flags 0x2
>>
>> The only flag set in this request is HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES,
>> however, processor_mask is 0x0 and no HV_FLUSH_ALL_PROCESSORS is specified.
>> We don't flush anything and apparently it's not what Windows expects.
>>
>> TLFS doesn't say anything about such requests and newer Windows versions
>> seem to be unaffected. This all feels like a WS2012 bug, which is, however,
>> easy to work around in KVM: let's flush everything when we see an empty
>> flush request, over-flushing doesn't hurt.
>>
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@...hat.com>
>> ---
>> arch/x86/kvm/hyperv.c | 12 +++++++++++-
>> 1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index 421899f6ad7b..5887f7d22ac6 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -1371,7 +1371,17 @@ static u64 kvm_hv_flush_tlb(struct kvm_vcpu *current_vcpu, u64 ingpa,
>>
>> valid_bank_mask = BIT_ULL(0);
>> sparse_banks[0] = flush.processor_mask;
>> - all_cpus = flush.flags & HV_FLUSH_ALL_PROCESSORS;
>> +
>> + /*
>> + * WS2012 seems to be buggy, under certain conditions it is
>> + * possible to observe requests with processor_mask = 0x0 and
>> + * no HV_FLUSH_ALL_PROCESSORS flag set. It also seems that
>
> "and no HV_FLUSH_ALL_PROCESSORS flag set" is awkward, and probably
> extraneous. The whole comment is probably a bit more verbose than it
> needs to be, e.g. most readers won't care how we came to the conclusion
> that 'processor_mask == 0', and those that care about the background will
> read the changelog anyways.
>
> Maybe something like this:
>
> /*
> * Some Windows versions, e.g. WS2012, use processor_mask = 0
> * in lieu of the dedicated flag to flush all processors.
> */
Hmm, not really. "In lieu" seems intentional. "without" is more accurate.
My take:
/*
 * Work around possible WS2012 bug: it sends hypercalls
 * with processor_mask = 0x0 and HV_FLUSH_ALL_PROCESSORS clear,
 * while also expecting us to flush something and crashing if
 * we don't. Let's treat processor_mask == 0 same as
 * HV_FLUSH_ALL_PROCESSORS.
 */
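For reference, a minimal standalone sketch of how the check could read with
that comment and the '!processor_mask' spelling. This is not the hyperv.c
code: the helper names are hypothetical, and the flag values are assumed for
illustration (bit 1 matches the "flags 0x2" trace above, bit 0 for
HV_FLUSH_ALL_PROCESSORS is an assumption).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag values assumed for this sketch; the real definitions live in the Hyper-V headers. */
#define HV_FLUSH_ALL_PROCESSORS             (1ULL << 0)
#define HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES (1ULL << 1)

/* Hypothetical helper: returns true when the request should hit every vCPU. */
static bool hv_flush_wants_all_cpus(uint64_t flags, uint64_t processor_mask)
{
	/*
	 * Work around possible WS2012 bug: it sends hypercalls
	 * with processor_mask = 0x0 and HV_FLUSH_ALL_PROCESSORS clear,
	 * while also expecting us to flush something and crashing if
	 * we don't. Let's treat processor_mask == 0 same as
	 * HV_FLUSH_ALL_PROCESSORS.
	 */
	return (flags & HV_FLUSH_ALL_PROCESSORS) || !processor_mask;
}

/* Usage checks, including the traced request (processor_mask 0x0, flags 0x2). */
static void hv_flush_usage_examples(void)
{
	/* Empty mask without the flag: treated as "flush all". */
	assert(hv_flush_wants_all_cpus(HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES, 0x0));
	/* Explicit flag: flush all regardless of the mask. */
	assert(hv_flush_wants_all_cpus(HV_FLUSH_ALL_PROCESSORS, 0x5));
	/* Non-empty mask without the flag: only the listed vCPUs. */
	assert(!hv_flush_wants_all_cpus(HV_FLUSH_ALL_VIRTUAL_ADDRESS_SPACES, 0x5));
}
```

The only behavioural change is the first case; requests with a non-empty
mask are handled as before.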
Paolo
>
>
>> + * Windows actually expects us to flush something and crashes
>> + * otherwise. Let's treat processor_mask == 0 same as
>> + * HV_FLUSH_ALL_PROCESSORS.
>> + */
>> + all_cpus = (flush.flags & HV_FLUSH_ALL_PROCESSORS) ||
>> + (flush.processor_mask == 0);
>
> Nits:
>
> Personal preference, but I like '!flush.processor_mask' in this case as it
> immediately conveys that we're handling the scenario where the guest didn't
> set a mask. Then there wouldn't be a visual need for the second set of
> parentheses.
>
> Aligning its indentation with the first chunk of the statement would
> also be nice, but again, personal preference. :-)
>
>> } else {
>> if (unlikely(kvm_read_guest(kvm, ingpa, &flush_ex,
>> sizeof(flush_ex))))
>> --
>> 2.20.1
>>