Message-ID: <509b697f-4e60-94e5-f785-95f7f0a14006@gmail.com>
Date:   Fri, 7 Apr 2023 16:15:24 +0800
From:   Like Xu <like.xu.linux@...il.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Ravi Bangoria <ravi.bangoria@....com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/5] KVM: x86/pmu: Hide guest counter updates from the
 VMRUN instruction

On 7/4/2023 10:18 am, Sean Christopherson wrote:
> On Fri, Mar 10, 2023, Like Xu wrote:
>> From: Like Xu <likexu@...cent.com>
>>
>> When an AMD guest is counting (branch) instructions events, its vPMU
>> should first subtract one from any relevant enabled (branch)-instructions
>> counter (when the update precedes VMRUN and cannot be preempted) to
>> offset the inevitable plus-one effect of the VMRUN instruction that
>> immediately follows.
>>
>> Based on a number of micro-observations (also the reason why x86_64/
>> pmu_event_filter_test fails on AMD Zen platforms), each VMRUN increments
>> all hw (branch-)instructions counters by 1, even if they are enabled only
>> for guest code. This issue seriously skews the performance picture that
>> guest developers build from (branch) instruction events.
>>
>> If the current physical register value on the hardware is ~0x0, VMRUN
>> triggers an overflow in the guest world right after it runs. Although
>> this cannot be avoided on mainstream released hardware, the resulting PMI
>> (if configured) will not be incorrectly injected into the guest by the
>> vPMU, since the delayed injection mechanism for a normal counter overflow
>> depends only on changes to pmc->counter values.
> 
> IIUC, this is saying that KVM may get a spurious PMI, but otherwise nothing bad
> will happen?

Guests have nothing to lose; under this proposal they only gain vPMI accuracy.

When the host gets an overflow interrupt caused by a VMRUN, it forwards it to
KVM. KVM does not inject it into the VM but discards it. Anyone using the PMU
to profile the hypervisor itself loses one interrupt, or one sample, in the
VMRUN context.
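
For reference, here is a minimal userspace model of the compensation and of
why the spurious hw overflow is harmless. It is an illustration only, not the
actual KVM code; the struct and variable names are made up:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct fake_pmc {
	uint64_t counter;	/* stand-in for pmc->counter */
};

int main(void)
{
	struct fake_pmc pmc = { .counter = 0 };
	uint64_t hw = pmc.counter;	/* value loaded into the hw counter */
	bool hw_overflow;

	/* Pre-VMRUN compensation: subtract one so VMRUN's +1 nets to zero. */
	hw -= 1;			/* hw is now ~0x0 */

	/* The quirk: hardware counts VMRUN as one (branch) instruction. */
	hw_overflow = (hw == ~0ULL);	/* the increment wraps to 0 ... */
	hw += 1;			/* ... and raises a hw overflow PMI */

	/*
	 * The host PMI is forwarded to KVM and discarded there: delayed
	 * injection keys off a change in pmc->counter, and the net value
	 * seen by the guest is unchanged.
	 */
	pmc.counter = hw;
	printf("guest-visible counter: %llu, hw overflow: %d (discarded)\n",
	       (unsigned long long)pmc.counter, (int)hw_overflow);
	return 0;
}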

> 
>> +static inline bool event_is_branch_instruction(struct kvm_pmc *pmc)
>> +{
>> +	return eventsel_match_perf_hw_id(pmc, PERF_COUNT_HW_INSTRUCTIONS) ||
>> +		eventsel_match_perf_hw_id(pmc,
>> +					  PERF_COUNT_HW_BRANCH_INSTRUCTIONS);
>> +}
>> +
>> +static inline bool quirky_pmc_will_count_vmrun(struct kvm_pmc *pmc)
>> +{
>> +	return event_is_branch_instruction(pmc) && event_is_allowed(pmc) &&
>> +		!static_call(kvm_x86_get_cpl)(pmc->vcpu);
> 
> Wait, really?  VMRUN is counted if and only if it enters to a CPL0 guest?  Can
> someone from AMD confirm this?  I was going to say we should just treat this as
> "normal" behavior, but counting CPL0 but not CPL>0 is definitely quirky.

VMRUN is only counted on a CPL0-target (branch) instructions counter, i.e.
when the guest context being entered is CPL0. VMRUN is not expected to be
counted by the guest counters at all, regardless of the guest CPL.

This issue makes a guest CPL0-target instruction counter inexplicably
increase, as if it had been under-counting before the virtualization
instructions were added to its tally.

Treating a host hypervisor instruction like VMRUN as a guest workload
instruction is an error in itself, not "normal" behavior; it directly
affects guest accuracy.
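
To restate the observed behavior as a standalone predicate (a sketch that
mirrors the quirky_pmc_will_count_vmrun() check above, with placeholder
types instead of the real KVM helpers):

#include <stdbool.h>

enum hw_event { EV_INSTRUCTIONS, EV_BRANCH_INSTRUCTIONS, EV_OTHER };

/*
 * Observed on Zen: VMRUN is counted iff the counter measures a
 * (branch-)instructions event, is enabled, and the guest context
 * being entered is CPL0.
 */
static bool vmrun_is_counted(enum hw_event ev, bool counter_enabled,
			     int guest_cpl)
{
	if (ev != EV_INSTRUCTIONS && ev != EV_BRANCH_INSTRUCTIONS)
		return false;

	return counter_enabled && guest_cpl == 0;
}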
