Message-ID: <aa1d858f-ad45-150c-2bbb-97523ce78e22@gmail.com>
Date:   Wed, 26 Apr 2023 14:25:53 +0800
From:   Like Xu <like.xu.linux@...il.com>
To:     Sandipan Das <sandipan.das@....com>,
        Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Ravi Bangoria <ravi.bangoria@....com>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org,
        Santosh Shukla <santosh.shukla@....com>,
        "Tom Lendacky (AMD)" <thomas.lendacky@....com>,
        Ananth Narayan <ananth.narayan@....com>
Subject: Re: [PATCH 5/5] KVM: x86/pmu: Hide guest counter updates from the
 VMRUN instruction

On 26/4/2023 1:25 pm, Sandipan Das wrote:
> Hi Sean, Like,
> 
> On 4/19/2023 7:11 PM, Like Xu wrote:
>> On 7/4/2023 10:56 pm, Sean Christopherson wrote:
>>> On Fri, Apr 07, 2023, Like Xu wrote:
>>>> On 7/4/2023 10:18 am, Sean Christopherson wrote:
>>>>> Wait, really?  VMRUN is counted if and only if it enters to a CPL0 guest?  Can
>>>>> someone from AMD confirm this?  I was going to say we should just treat this as
>>>>> "normal" behavior, but counting CPL0 but not CPL>0 is definitely quirky.
>>>>
>>>> VMRUN is only counted on a CPL0-target (branch) instruction counter.
>>>
>>> Yes or no question: if KVM does VMRUN and a PMC is programmed to count _all_ taken
>>> branches, will the PMC count VMRUN as a branch if guest CPL>0 according to the VMCB?
>>
>> YES, my quick tests (based on run_in_user() from KUT on Zen4) show:
>>
>> EVENTSEL_GUESTONLY + EVENTSEL_ALL + VMRUN_to_USR -> AMD_ZEN_BR_RETIRED + 1
>> EVENTSEL_GUESTONLY + EVENTSEL_ALL + VMRUN_to_OS -> AMD_ZEN_BR_RETIRED + 1
>>
>> EVENTSEL_GUESTONLY + EVENTSEL_USR + VMRUN_to_USR -> AMD_ZEN_BR_RETIRED + 1
>> EVENTSEL_GUESTONLY + EVENTSEL_OS + VMRUN_to_OS -> AMD_ZEN_BR_RETIRED + 1
>>
>> EVENTSEL_GUESTONLY + EVENTSEL_OS + VMRUN_to_USR -> No change
>> EVENTSEL_GUESTONLY + EVENTSEL_USR + VMRUN_to_OS -> No change
>>
>> I'm actually not surprised, and a related test will be posted later.
>>
>>>
>>>> This issue makes a guest CPL0-target instruction counter inexplicably
>>>> increase, as if it would have been under-counted before the virtualization
>>>> instructions were counted.
>>>
>>> Heh, it's very much explicable, it's just not desirable, and you and I would argue
>>> that it's also incorrect.
>>
>> This is completely inaccurate from the end guest pmu user's perspective.
>>
>> I have a toy that looks like virtio-pmu, through which guest users can get hypervisor performance data.
>> But the side effect of letting the guest see the VMRUN instruction by default is unacceptable, isn't it?
>>
>>>
>>> AMD folks, are there plans to document this as an erratum?  I agree with Like that
>>> counting VMRUN as a taken branch in guest context is a CPU bug, even if the behavior
>>> is known/expected.
>>
> 
> This behaviour is architectural and an erratum will not be issued. However, for clarity, a future
> release of the APM will include additional details like the following:
> 
>    1) From the perspective of performance monitoring counters, VMRUNs are considered as far control
>       transfers and VMEXITs as exceptions.
> 
>    2) When the performance monitoring counters are set up to count events only in certain modes
>       through the "OsUserMode" and "HostGuestOnly" bits, instructions and events that change the
>       mode are counted in the target mode. For example, a SYSCALL from CPL 3 to CPL 0 with a
>       counter set to count retired instructions with USR=1 and OS=0 will not cause an increment of
>       the counter. However, the SYSRET back from CPL 0 to CPL 3 will cause an increment of the
>       counter and the total count will end up correct. Similarly, when counting PMCx0C6 (retired
>       far control transfers, including exceptions and interrupts) with Guest=1 and Host=0, a VMRUN
>       instruction will cause an increment of the counter. However, the subsequent VMEXIT that occurs,
>       since the target is in the host, will not cause an increment of the counter and so the total
>       count will end up correct.
> 

Thanks for the clarification, that fits my understanding.

"Counted in target mode" and "correct total count" are architectural choices,
which are not a problem if the consumers of PMU data are on the same side.

But for a VM user, seeing SYSRET in user mode is completely and functionally
different from seeing VMRUN in guest context, since the host user and
the guest user are two separate PMU data consumers and they do not aggregate
or share the so-called "total" PMU data.
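
To make that concrete, here is a toy Python model of the "counted in target
mode" rule as described in the quoted text. It is only a sketch: `Mode`,
`Counter` and `vmrun_delta` are names made up for illustration, and nothing
here talks to real hardware or to KVM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    guest: bool  # True when running in guest context
    cpl: int     # 0 = OS, anything else = USR


@dataclass
class Counter:
    """Hypothetical model of a mode-filtered PMC."""
    count_guest: bool
    count_host: bool
    count_os: bool
    count_usr: bool
    value: int = 0

    def matches(self, mode: Mode) -> bool:
        side = self.count_guest if mode.guest else self.count_host
        ring = self.count_os if mode.cpl == 0 else self.count_usr
        return side and ring

    def retire_transfer(self, target: Mode) -> None:
        # A mode-changing instruction is attributed to its *target* mode.
        if self.matches(target):
            self.value += 1


HOST_OS = Mode(guest=False, cpl=0)
GUEST_OS = Mode(guest=True, cpl=0)
GUEST_USR = Mode(guest=True, cpl=3)


def vmrun_delta(count_os: bool, count_usr: bool, target: Mode) -> int:
    """Increment seen by a guest-only counter across one VMRUN/VMEXIT pair."""
    c = Counter(count_guest=True, count_host=False,
                count_os=count_os, count_usr=count_usr)
    c.retire_transfer(target)   # VMRUN: host -> guest
    c.retire_transfer(HOST_OS)  # VMEXIT: guest -> host, filtered out
    return c.value
```

With OS and USR both set, or with the filter matching the CPL that VMRUN
enters, `vmrun_delta()` returns 1; with a mismatched filter it returns 0,
which is the same pattern as the Zen4 test matrix quoted earlier. The point is
that the +1 stays in the guest-only counter, and nothing the guest itself
executes accounts for it.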

This situation is even worse for nested SVM guests and SEV-SNP guests.

I'm not urging AMD to change the hardware, but it is entirely necessary
for our software layer to take this step, as it is part of the hypervisor's
responsibility to hide itself by default.
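
As a rough illustration of what "hiding" could mean, the Python sketch below
subtracts the one count that a VMRUN would have contributed before the value
is shown to the guest. This is not the code in this series; `GuestPmc`,
`vmrun_was_counted` and `hide_vmrun` are hypothetical names, and the 48-bit
counter width is an assumption of the sketch.

```python
from dataclasses import dataclass

PMC_MASK = (1 << 48) - 1  # assumed 48-bit counter width


@dataclass
class GuestPmc:
    """Hypothetical view of a guest counter, not a KVM structure."""
    enabled: bool
    count_os: bool
    count_usr: bool
    sees_vmrun: bool  # programmed event is one that VMRUN contributes to


def vmrun_was_counted(pmc: GuestPmc, entry_cpl: int) -> bool:
    # VMRUN is attributed to the mode it enters, so the OS/USR filter is
    # checked against the guest CPL at VM entry.
    ring_ok = pmc.count_os if entry_cpl == 0 else pmc.count_usr
    return pmc.enabled and pmc.sees_vmrun and ring_ok


def hide_vmrun(pmc: GuestPmc, raw_value: int, entry_cpl: int) -> int:
    """Return the counter value with the VMRUN increment removed, if any."""
    if vmrun_was_counted(pmc, entry_cpl):
        return (raw_value - 1) & PMC_MASK  # mask handles counter wraparound
    return raw_value
```

For example, a counter with only the OS bit set would be adjusted after an
entry to guest CPL 0 and left untouched after an entry to guest CPL 3.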
