lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 28 Jun 2023 13:37:44 +0800
From:   Like Xu <like.xu.linux@...il.com>
To:     "Zhang, Xiong Y" <xiong.y.zhang@...el.com>,
        "Christopherson,, Sean" <seanjc@...gle.com>
Cc:     "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "kan.liang@...ux.intel.com" <kan.liang@...ux.intel.com>,
        "zhenyuw@...ux.intel.com" <zhenyuw@...ux.intel.com>,
        "Lv, Zhiyuan" <zhiyuan.lv@...el.com>
Subject: Re: [PATCH 2/4] KVM: VMX/pmu: Save host debugctlmsr just before vm
 entry

On 25/6/2023 12:03 pm, Zhang, Xiong Y wrote:
>> On Fri, Jun 16, 2023, Xiong Zhang wrote:
>>> Perf defines four types of perf event: per cpu pinned event, per
>>> process pinned event, per cpu event, per process event, their
>>> prioirity are from high to low. vLBR event is per process pinned
>>> event. So durng vm exit handler, if vLBR event preempts perf low
>>> priority LBR event, perf will disable LBR and let guest control LBR,
>>> or if vLBR event is preempted by perf high priority LBR event, perf
>>> will enable LBR. In a word LBR status may be changed during vm exit handler.
>>>
>>> MSR_IA32_DEBUGCTLMSR[0] controls LBR enabling, kvm saves its value
>>> into
>>> vmx->host_debugctlmsr in vcpu_load(), and kvm restores its value from
>>> vmx->host_debugctlmsr after vm exit immediately. Since
>>> MSR_IA32_DEBUGCTLMSR[0] could be changed during vm exit handler, the
>>> saved value vmx->host_debugctlmsr could be wrong. So this commit saves
>>> MSR_IA32_DEBUGCTLMSR into vmx->host_debugctlmsr just before vm entry
>>> to reflect the real hardware value.
>>>
>>> Signed-off-by: Xiong Zhang <xiong.y.zhang@...el.com>
>>> ---
>>>   arch/x86/kvm/vmx/vmx.c | 5 +----
>>>   1 file changed, 1 insertion(+), 4 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index
>>> 44fb619803b8..5ca61a26d0d7 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -1459,13 +1459,9 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu,
>> int cpu,
>>>    */
>>>   static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)  {
>>> -	struct vcpu_vmx *vmx = to_vmx(vcpu);
>>> -
>>>   	vmx_vcpu_load_vmcs(vcpu, cpu, NULL);
>>>
>>>   	vmx_vcpu_pi_load(vcpu, cpu);
>>> -
>>> -	vmx->host_debugctlmsr = get_debugctlmsr();
>>>   }
>>>
>>>   static void vmx_vcpu_put(struct kvm_vcpu *vcpu) @@ -7273,6 +7269,7 @@
>>> static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>>>   	atomic_switch_perf_msrs(vmx);
>>>   	if (intel_pmu_lbr_is_enabled(vcpu))
>>>   		vmx_passthrough_lbr_msrs(vcpu);
>>> +	vmx->host_debugctlmsr = get_debugctlmsr();
>>
>> Reading DEBUG_CTL on every VM-Entry is either unnecessary or insufficient.  If
>> the DEBUG_CTL value is being changed synchronously, then just fix whatever
>> KVM path leads to a change in the host avlue.  If DEBUG_CTL is being changed
>> asynchronously, then I'm guessing the change is coming from NMI context,
>> which means that KVM is buggy no matter how close we put this to VM-Enter.
> When a perf event reschedule is needed on a physical cpu, perf scheduler send an IPI to the target cpu, LBR will be enabled or disabled in the IPI handler according to active event attribute.
> If vLBR event is active, LBR is disabled in IPI handler.
> If Host LBR event is active, LBR is enabled in the IPI handler, this could happen when host LBR event preempt vLBR event during vm exit handler.
> DEBUG_CTL[0]'s changing is asynchronous in the perf IPI handler,  host irq is disabled near VM-Enter, so IPI couldn't happen, then host DEBUG_CTL[0] couldn't change before kvm enable host irq.
> Perf event counter overflow (PMI) is a NMI, but this NMI handler doesn't change LBR status, the kvm saved host_debugctlmsr is correct still after PMI handler.
> 
> thanks

This is not true. One example is Freezing LBRs on PMI (bit 11) in the host NMI ctx.

For "Legacy Freeze_LBR_on_PMI" feature on a host, "the LBR is frozen on the
overflowed condition of the buffer area, the processor clears the LBR bit
(bit 0) in IA32_DEBUGCTL."

Not to mention that the commit message makes no mention of the effect of
this change on other features on DEBUG_CTL.

I couldn't agree with Sean more here. I think the first is to make sure that 
debugctl's
functionality is not broken in both root mode and non-root mode, followed closely
by what policy should be set and notified to any user if host/kvm are not in a
position to support either side.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ