linux-kernel - Re: [PATCH 2/7] KVM: x86: Implement Hyper-V's vCPU suspended state


Message-ID: <9ef935db-459a-4738-ab9a-4bd08828cb60@gmx.de>
Date: Mon, 14 Oct 2024 19:50:17 +0200
From: Nikolas Wipper <nik.wipper@....de>
To: Vitaly Kuznetsov <vkuznets@...hat.com>, Nikolas Wipper <nikwip@...zon.de>
Cc: Nicolas Saenz Julienne <nsaenz@...zon.com>,
 Alexander Graf <graf@...zon.de>, James Gowans <jgowans@...zon.com>,
 nh-open-source@...zon.com, Sean Christopherson <seanjc@...gle.com>,
 Paolo Bonzini <pbonzini@...hat.com>, Thomas Gleixner <tglx@...utronix.de>,
 Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
 Dave Hansen <dave.hansen@...ux.intel.com>, linux-kernel@...r.kernel.org,
 kvm@...r.kernel.org, x86@...nel.org, linux-doc@...r.kernel.org,
 linux-kselftest@...r.kernel.org
Subject: Re: [PATCH 2/7] KVM: x86: Implement Hyper-V's vCPU suspended state

On 10.10.24 10:57, Vitaly Kuznetsov wrote:
> Nikolas Wipper <nikwip@...zon.de> writes:
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 46e0a466d7fb..7571ac578884 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -695,6 +695,9 @@ struct kvm_vcpu_hv {
>>  		u64 vm_id;
>>  		u32 vp_id;
>>  	} nested;
>> +
>> +	bool suspended;
>> +	int waiting_on;
>
> I don't quite understand why we need 'suspended' at all. Isn't it always
> suspended when 'waiting_on != -1'? I can see we always update these two
> in pair.
>

This is mainly for future-proofing the implementation. You are right, this
is currently not required, but it's nice to have a single flag, so that
when the suspended state is used in a different context, the whole logic
surrounding it still works.

> Also, I would suggest we use a more descriptive
> name. 'waiting_on_vcpu_id', for example.
>

Sounds good.

>>  };
>>
>>  struct kvm_hypervisor_cpuid {
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index 4f0a94346d00..6e7941ed25ae 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -971,6 +971,7 @@ int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
>>
>>  	vcpu->arch.hyperv = hv_vcpu;
>>  	hv_vcpu->vcpu = vcpu;
>> +	hv_vcpu->waiting_on = -1;
>>
>>  	synic_init(&hv_vcpu->synic);
>>
>> @@ -2915,3 +2916,32 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
>>
>>  	return 0;
>>  }
>> +
>> +void kvm_hv_vcpu_suspend_tlb_flush(struct kvm_vcpu *vcpu, int vcpu_id)
>
> Can we make parameter's name 'waiting_on_vcpu_id' as well? Because as-is
> I'm getting confused which CPU of these two is actually getting
> suspended)
>

Yup, that would certainly help readability.

> Also, why do we need '_tlb_flush' in the name? The mechanism seems to be
> fairly generic, it's just that we use it for TLB flushes.
>

The 'waiting_on' part is TLB flushing specific.

>> +{
>> +	/* waiting_on's store should happen before suspended's */
>> +	WRITE_ONCE(vcpu->arch.hyperv->waiting_on, vcpu_id);
>> +	WRITE_ONCE(vcpu->arch.hyperv->suspended, true);
>> +}
>> +
>> +void kvm_hv_vcpu_unsuspend_tlb_flush(struct kvm_vcpu *vcpu)
>
> And here someone may expect this means 'unsuspend vcpu' but in reality
> this means 'unsuspend all vCPUs which are waiting on 'vcpu'). I guess we
> need a rename. How about
>
> void kvm_hv_unsuspend_vcpus(struct kvm_vcpu *waiting_on_vcpu)
>
> ?
>

Also sounds good.
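
To make the agreed naming concrete, here is a minimal user-space sketch of
the suspend/unsuspend pair with the renamed parameters. It is only a model,
not the patch itself: the model_* names, NO_VCPU and the 64-bit return mask
are assumptions made for illustration, and the kernel's WRITE_ONCE(),
kvm_for_each_vcpu() and KVM_REQ_EVENT handling are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NO_VCPU (-1)	/* hypothetical name for the "not waiting" value */

/* Hypothetical stand-in for the two fields added to struct kvm_vcpu_hv. */
struct model_vcpu {
	int vcpu_id;
	bool suspended;
	int waiting_on_vcpu_id;
};

static void model_vcpu_init(struct model_vcpu *vcpu, int vcpu_id)
{
	vcpu->vcpu_id = vcpu_id;
	vcpu->suspended = false;
	vcpu->waiting_on_vcpu_id = NO_VCPU;
}

/* Suspend 'vcpu' until 'waiting_on_vcpu_id' has finished its work. */
static void model_suspend_vcpu(struct model_vcpu *vcpu, int waiting_on_vcpu_id)
{
	assert(waiting_on_vcpu_id != NO_VCPU);
	vcpu->waiting_on_vcpu_id = waiting_on_vcpu_id;
	vcpu->suspended = true;
}

/*
 * Resume every vCPU that is suspended and waiting on 'waiting_on_vcpu'.
 * Returns a bitmask of the array indices that were resumed; the real code
 * would use such a mask to kick the affected vCPUs.
 */
static uint64_t model_unsuspend_vcpus(struct model_vcpu *vcpus, size_t nr_vcpus,
				      const struct model_vcpu *waiting_on_vcpu)
{
	uint64_t resumed = 0;
	size_t i;

	assert(nr_vcpus <= 64);

	for (i = 0; i < nr_vcpus; i++) {
		struct model_vcpu *v = &vcpus[i];

		if (v->suspended &&
		    v->waiting_on_vcpu_id == waiting_on_vcpu->vcpu_id) {
			v->waiting_on_vcpu_id = NO_VCPU;
			v->suspended = false;
			resumed |= UINT64_C(1) << i;
		}
	}

	return resumed;
}
```

With these names, the argument of the unsuspend helper is clearly the vCPU
that others are waiting on, not the vCPU being resumed.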

>> +{
>> +	DECLARE_BITMAP(vcpu_mask, KVM_MAX_VCPUS);
>> +	struct kvm_vcpu_hv *vcpu_hv;
>> +	struct kvm_vcpu *v;
>> +	unsigned long i;
>> +
>> +	kvm_for_each_vcpu(i, v, vcpu->kvm) {
>> +		vcpu_hv = to_hv_vcpu(v);
>> +
>> +		if (kvm_hv_vcpu_suspended(v) &&
>> +		    READ_ONCE(vcpu_hv->waiting_on) == vcpu->vcpu_id) {
>> +			/* waiting_on's store should happen before suspended's */
>> +			WRITE_ONCE(v->arch.hyperv->waiting_on, -1);
>> +			WRITE_ONCE(v->arch.hyperv->suspended, false);
>> +			__set_bit(i, vcpu_mask);
>> +		}
>> +	}
>> +
>> +	kvm_make_vcpus_request_mask(vcpu->kvm, KVM_REQ_EVENT, vcpu_mask);
>> +}
>> diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
>> index 913bfc96959c..a55832cea221 100644
>> --- a/arch/x86/kvm/hyperv.h
>> +++ b/arch/x86/kvm/hyperv.h
>> @@ -265,6 +265,15 @@ static inline void kvm_hv_nested_transtion_tlb_flush(struct kvm_vcpu *vcpu,
>>  }
>>
>>  int kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu);
>> +
>> +static inline bool kvm_hv_vcpu_suspended(struct kvm_vcpu *vcpu)
>> +{
>> +	return vcpu->arch.hyperv_enabled &&
>> +	       READ_ONCE(vcpu->arch.hyperv->suspended);
>
> I don't think READ_ONCE() means anything here, does it?
>

It does prevent compiler optimisations and is actually required[1]. Also
it makes clear that this variable is shared, and may be accessed from
remote CPUs.

[1] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0124r6.html#Variable%20Access
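
As a rough illustration of the point, here is a user-space sketch with
simplified stand-ins for the two macros. These are assumptions for the
example: the kernel's real READ_ONCE()/WRITE_ONCE() do more type checking,
and the model_* names are hypothetical. The volatile access forces the
compiler to emit a fresh load on every call, so a polling loop cannot have
the load hoisted out of it. It does not by itself order accesses between
CPUs.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins (GCC/Clang __typeof__), not the kernel definitions. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical model of the shared per-vCPU state. */
struct model_hv_state {
	bool suspended;
};

/* Marked load: the flag may be changed by another CPU at any time. */
static bool model_vcpu_suspended(struct model_hv_state *hv)
{
	assert(hv);
	return READ_ONCE(hv->suspended);
}

/*
 * Poll the flag up to max_polls times and return how many polls saw the
 * vCPU still suspended. With a plain load the compiler would be free to
 * read hv->suspended once and reuse the value for the whole loop.
 */
static unsigned long model_poll_until_resumed(struct model_hv_state *hv,
					      unsigned long max_polls)
{
	unsigned long polls = 0;

	while (polls < max_polls && model_vcpu_suspended(hv))
		polls++;

	return polls;
}
```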

Nikolas