[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <613d0cfa-e118-1e7e-b313-6c775a2daa14@gmail.com>
Date: Wed, 28 Jun 2023 13:37:44 +0800
From: Like Xu <like.xu.linux@...il.com>
To: "Zhang, Xiong Y" <xiong.y.zhang@...el.com>,
"Christopherson,, Sean" <seanjc@...gle.com>
Cc: "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"pbonzini@...hat.com" <pbonzini@...hat.com>,
"peterz@...radead.org" <peterz@...radead.org>,
"kan.liang@...ux.intel.com" <kan.liang@...ux.intel.com>,
"zhenyuw@...ux.intel.com" <zhenyuw@...ux.intel.com>,
"Lv, Zhiyuan" <zhiyuan.lv@...el.com>
Subject: Re: [PATCH 2/4] KVM: VMX/pmu: Save host debugctlmsr just before vm
entry
On 25/6/2023 12:03 pm, Zhang, Xiong Y wrote:
>> On Fri, Jun 16, 2023, Xiong Zhang wrote:
>>> Perf defines four types of perf event: per cpu pinned event, per
>>> process pinned event, per cpu event, per process event, their
>>> prioirity are from high to low. vLBR event is per process pinned
>>> event. So durng vm exit handler, if vLBR event preempts perf low
>>> priority LBR event, perf will disable LBR and let guest control LBR,
>>> or if vLBR event is preempted by perf high priority LBR event, perf
>>> will enable LBR. In a word LBR status may be changed during vm exit handler.
>>>
>>> MSR_IA32_DEBUGCTLMSR[0] controls LBR enabling, kvm saves its value
>>> into
>>> vmx->host_debugctlmsr in vcpu_load(), and kvm restores its value from
>>> vmx->host_debugctlmsr after vm exit immediately. Since
>>> MSR_IA32_DEBUGCTLMSR[0] could be changed during vm exit handler, the
>>> saved value vmx->host_debugctlmsr could be wrong. So this commit saves
>>> MSR_IA32_DEBUGCTLMSR into vmx->host_debugctlmsr just before vm entry
>>> to reflect the real hardware value.
>>>
>>> Signed-off-by: Xiong Zhang <xiong.y.zhang@...el.com>
>>> ---
>>> arch/x86/kvm/vmx/vmx.c | 5 +----
>>> 1 file changed, 1 insertion(+), 4 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index
>>> 44fb619803b8..5ca61a26d0d7 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -1459,13 +1459,9 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu,
>> int cpu,
>>> */
>>> static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) {
>>> - struct vcpu_vmx *vmx = to_vmx(vcpu);
>>> -
>>> vmx_vcpu_load_vmcs(vcpu, cpu, NULL);
>>>
>>> vmx_vcpu_pi_load(vcpu, cpu);
>>> -
>>> - vmx->host_debugctlmsr = get_debugctlmsr();
>>> }
>>>
>>> static void vmx_vcpu_put(struct kvm_vcpu *vcpu) @@ -7273,6 +7269,7 @@
>>> static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>> atomic_switch_perf_msrs(vmx);
>>> if (intel_pmu_lbr_is_enabled(vcpu))
>>> vmx_passthrough_lbr_msrs(vcpu);
>>> + vmx->host_debugctlmsr = get_debugctlmsr();
>>
>> Reading DEBUG_CTL on every VM-Entry is either unnecessary or insufficient. If
>> the DEBUG_CTL value is being changed synchronously, then just fix whatever
>> KVM path leads to a change in the host avlue. If DEBUG_CTL is being changed
>> asynchronously, then I'm guessing the change is coming from NMI context,
>> which means that KVM is buggy no matter how close we put this to VM-Enter.
> When a perf event reschedule is needed on a physical cpu, perf scheduler send an IPI to the target cpu, LBR will be enabled or disabled in the IPI handler according to active event attribute.
> If vLBR event is active, LBR is disabled in IPI handler.
> If Host LBR event is active, LBR is enabled in the IPI handler, this could happen when host LBR event preempt vLBR event during vm exit handler.
> DEBUG_CTL[0]'s changing is asynchronous in the perf IPI handler, host irq is disabled near VM-Enter, so IPI couldn't happen, then host DEBUG_CTL[0] couldn't change before kvm enable host irq.
> Perf event counter overflow (PMI) is a NMI, but this NMI handler doesn't change LBR status, the kvm saved host_debugctlmsr is correct still after PMI handler.
>
> thanks
This is not true. One example is Freezing LBRs on PMI (bit 11) in the host NMI ctx.
For "Legacy Freeze_LBR_on_PMI" feature on a host, "the LBR is frozen on the
overflowed condition of the buffer area, the processor clears the LBR bit
(bit 0) in IA32_DEBUGCTL."
Not to mention that the commit message makes no mention of the effect of
this change on other features on DEBUG_CTL.
I couldn't agree with Sean more here. I think the first is to make sure that
debugctl's
functionality is not broken in both root mode and non-root mode, followed closely
by what policy should be set and notified to any user if host/kvm are not in a
position to support either side.
Powered by blists - more mailing lists