lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 14 Jul 2022 17:05:42 +0200
From:   Vitaly Kuznetsov <vkuznets@...hat.com>
To:     Maxim Levitsky <mlevitsk@...hat.com>, kvm@...r.kernel.org
Cc:     Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        linux-kernel@...r.kernel.org, Paolo Bonzini <pbonzini@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>
Subject: Re: [PATCH 1/3] KVM: x86: Hyper-V invariant TSC control

Maxim Levitsky <mlevitsk@...hat.com> writes:

> On Wed, 2022-07-13 at 17:05 +0200, Vitaly Kuznetsov wrote:
>> Normally, genuine Hyper-V doesn't expose architectural invariant TSC
>> (CPUID.80000007H:EDX[8]) to its guests by default. A special PV MSR
>> (HV_X64_MSR_TSC_INVARIANT_CONTROL, 0x40000118) and corresponding CPUID
>> feature bit (CPUID.0x40000003.EAX[15]) were introduced. When bit 0 of the
>> PV MSR is set, invariant TSC bit starts to show up in CPUID. When the
>> feature is exposed to Hyper-V guests, reenlightenment becomes unneeded.
>> 
>> Add the feature to KVM. Keep CPUID output intact when the feature
>> wasn't exposed to L1 and implement the required logic for hiding
>> invariant TSC when the feature was exposed and invariant TSC control
>> MSR wasn't written to. Copy genuine Hyper-V behavior and forbid to
>> disable the feature once it was enabled.
>> 
>> For the reference, for linux guests, support for the feature was added
>> in commit dce7cd62754b ("x86/hyperv: Allow guests to enable InvariantTSC").
>> 
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@...hat.com>
>> ---
>>  arch/x86/include/asm/kvm_host.h |  1 +
>>  arch/x86/kvm/cpuid.c            |  7 +++++++
>>  arch/x86/kvm/hyperv.c           | 19 +++++++++++++++++++
>>  arch/x86/kvm/hyperv.h           | 15 +++++++++++++++
>>  arch/x86/kvm/x86.c              |  4 +++-
>>  5 files changed, 45 insertions(+), 1 deletion(-)
>> 
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index de5a149d0971..88553f0b524c 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -1022,6 +1022,7 @@ struct kvm_hv {
>>         u64 hv_reenlightenment_control;
>>         u64 hv_tsc_emulation_control;
>>         u64 hv_tsc_emulation_status;
>> +       u64 hv_invtsc;
>>  
>>         /* How many vCPUs have VP index != vCPU index */
>>         atomic_t num_mismatched_vp_indexes;
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index d47222ab8e6e..788df2eb1ec4 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -1404,6 +1404,13 @@ bool kvm_cpuid(struct kvm_vcpu *vcpu, u32 *eax, u32 *ebx,
>>                             (data & TSX_CTRL_CPUID_CLEAR))
>>                                 *ebx &= ~(F(RTM) | F(HLE));
>>                 }
>
> Tiny nitpick: Maybe add a bit longer comment about this thing, like that guest needs to opt-in
> to see invtsc when it has the HV feature exposed to it, 
> I don't have a strong preference about this though.
>
>> +               /*
>> +                * Filter out invariant TSC (CPUID.80000007H:EDX[8]) for Hyper-V
>> +                * guests if needed.
>> +                */
>> +               if (function == 0x80000007 && kvm_hv_invtsc_filtered(vcpu))
>> +                       *edx &= ~(1 << 8);
>
>> +
>>         } else {
>>                 *eax = *ebx = *ecx = *edx = 0;
>>                 /*
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index e2e95a6fccfd..0d8e6526a839 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -991,6 +991,7 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
>>         case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>>         case HV_X64_MSR_SYNDBG_OPTIONS:
>>         case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>>                 r = true;
>> @@ -1275,6 +1276,9 @@ static bool hv_check_msr_access(struct kvm_vcpu_hv *hv_vcpu, u32 msr)
>>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>>                 return hv_vcpu->cpuid_cache.features_eax &
>>                         HV_ACCESS_REENLIGHTENMENT;
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>> +               return hv_vcpu->cpuid_cache.features_eax &
>> +                       HV_ACCESS_TSC_INVARIANT;
>>         case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
>>         case HV_X64_MSR_CRASH_CTL:
>>                 return hv_vcpu->cpuid_cache.features_edx &
>> @@ -1402,6 +1406,17 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
>>                 if (!host)
>>                         return 1;
>>                 break;
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>> +               /* Only bit 0 is supported */
>> +               if (data & ~BIT_ULL(0))
>> +                       return 1;
>> +
>> +               /* The feature can't be disabled from the guest */
>> +               if (!host && hv->hv_invtsc && !data)
>> +                       return 1;
>
> The unit test in patch 3 claims, that this msr should #GP when 'invtsc'
> aka bit 8 of edx of leaf 0x80000007 is not enabled by the hypervisor
> in the guest cpuid.
>

No, we don't GP when architectural InvTsc is not there....

> Yet, looking at the code I think that this msr read/write access only depends on
> the 'new' cpuid bit, aka the HV_ACCESS_TSC_INVARIANT, thus this msr will 'work'
> but do nothing if 'invtsc' is not exposed (it will then not turn it on).

... as this PV feature just enabled and disables filtering. If
architectural InvTsc is not there then filtering just changes
nothing. VMM is supposed to not enable this without InvTsc itself as
it's kind of pointless.

>
>
>
>> +
>> +               hv->hv_invtsc = data;
>> +               break;
>>         case HV_X64_MSR_SYNDBG_OPTIONS:
>>         case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>>                 return syndbg_set_msr(vcpu, msr, data, host);
>> @@ -1577,6 +1592,9 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata,
>>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>>                 data = hv->hv_tsc_emulation_status;
>>                 break;
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>> +               data = hv->hv_invtsc;
>> +               break;
>>         case HV_X64_MSR_SYNDBG_OPTIONS:
>>         case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
>>                 return syndbg_get_msr(vcpu, msr, pdata, host);
>> @@ -2497,6 +2515,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>>                         ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
>>                         ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
>>                         ent->eax |= HV_ACCESS_REENLIGHTENMENT;
>> +                       ent->eax |= HV_ACCESS_TSC_INVARIANT;
>>  
>>                         ent->ebx |= HV_POST_MESSAGES;
>>                         ent->ebx |= HV_SIGNAL_EVENTS;
>> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
>> index da2737f2a956..1a6316ab55eb 100644
>> --- a/arch/x86/kvm/hyperv.h
>> +++ b/arch/x86/kvm/hyperv.h
>> @@ -133,6 +133,21 @@ static inline bool kvm_hv_has_stimer_pending(struct kvm_vcpu *vcpu)
>>                              HV_SYNIC_STIMER_COUNT);
>>  }
>>  
>> +/*
>> + * With HV_ACCESS_TSC_INVARIANT feature, invariant TSC (CPUID.80000007H:EDX[8])
>> + * is only observed after HV_X64_MSR_TSC_INVARIANT_CONTROL was written to.
>> + */
>> +static inline bool kvm_hv_invtsc_filtered(struct kvm_vcpu *vcpu)
>> +{
>> +       struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
>> +       struct kvm_hv *hv = to_kvm_hv(vcpu->kvm);
>> +
>> +       if (hv_vcpu && hv_vcpu->cpuid_cache.features_eax & HV_ACCESS_TSC_INVARIANT)
>> +               return !hv->hv_invtsc;
>> +
>> +       return false;
>> +}
>> +
>>  void kvm_hv_process_stimers(struct kvm_vcpu *vcpu);
>>  
>>  void kvm_hv_setup_tsc_page(struct kvm *kvm,
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 567d13405445..322e0a544823 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1466,7 +1466,7 @@ static const u32 emulated_msrs_all[] = {
>>         HV_X64_MSR_STIMER0_CONFIG,
>>         HV_X64_MSR_VP_ASSIST_PAGE,
>>         HV_X64_MSR_REENLIGHTENMENT_CONTROL, HV_X64_MSR_TSC_EMULATION_CONTROL,
>> -       HV_X64_MSR_TSC_EMULATION_STATUS,
>> +       HV_X64_MSR_TSC_EMULATION_STATUS, HV_X64_MSR_TSC_INVARIANT_CONTROL,
>>         HV_X64_MSR_SYNDBG_OPTIONS,
>>         HV_X64_MSR_SYNDBG_CONTROL, HV_X64_MSR_SYNDBG_STATUS,
>>         HV_X64_MSR_SYNDBG_SEND_BUFFER, HV_X64_MSR_SYNDBG_RECV_BUFFER,
>> @@ -3769,6 +3769,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>         case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>>                 return kvm_hv_set_msr_common(vcpu, msr, data,
>>                                              msr_info->host_initiated);
>>         case MSR_IA32_BBL_CR_CTL3:
>> @@ -4139,6 +4140,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>>         case HV_X64_MSR_REENLIGHTENMENT_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_CONTROL:
>>         case HV_X64_MSR_TSC_EMULATION_STATUS:
>> +       case HV_X64_MSR_TSC_INVARIANT_CONTROL:
>>                 return kvm_hv_get_msr_common(vcpu,
>>                                              msr_info->index, &msr_info->data,
>>                                              msr_info->host_initiated);
>
>
> Beware that this new msr also will need to be migrated by qemu, 
> when the feature is added to qemu -
> I had my own share of fun with AMD's TSC ratio msr when I implemented it
> (had to fix it twice in qemu :( ...)

Yes, we do this for a numbet of Hyper-V MSRs already
(e.g. reenlightnment MSRs).

-- 
Vitaly

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ