linux-kernel - Re: [PATCH 4/7] KVM: x86/pmu: Not to generate PEBS records for emulated instructions

Message-ID: <aacf1eb4-26f6-4c62-9c4a-d8249a986c8c@gmail.com>
Date:   Thu, 21 Jul 2022 10:22:14 +0800
From:   Like Xu <like.xu.linux@...il.com>
To:     Sean Christopherson <seanjc@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Jim Mattson <jmattson@...gle.com>,
        linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Subject: Re: [PATCH 4/7] KVM: x86/pmu: Not to generate PEBS records for
 emulated instructions

On 21/7/2022 8:51 am, Sean Christopherson wrote:
> "Don't" instead of "Not to".  Not is an adverb, not a verb itself.
> 
> On Wed, Jul 13, 2022, Like Xu wrote:
>> From: Like Xu <likexu@...cent.com>
>>
>> The KVM accumulate an enabeld counter for at least INSTRUCTIONS or
> 
> Probably just "KVM" instead of "the KVM"?
> 
> s/enabeld/enabled

Applied, thanks.

> 
>> BRANCH_INSTRUCTION hw event from any KVM emulated instructions,
>> generating emulated overflow interrupt on counter overflow, which
>> in theory should also happen when the PEBS counter overflows but
>> it currently lacks this part of the underlying support (e.g. through
>> software injection of records in the irq context or a lazy approach).
>>
>> In this case, KVM skips the injection of this BUFFER_OVF PMI (effectively
>> dropping one PEBS record) and let the overflow counter move on. The loss
>> of a single sample does not introduce a loss of accuracy, but is easily
>> noticeable for certain specific instructions.
>>
>> This issue is expected to be addressed along with the issue
>> of PEBS cross-mapped counters with a slow-path proposal.
>>
>> Fixes: 79f3e3b58386 ("KVM: x86/pmu: Reprogram PEBS event to emulate guest PEBS counter")
>> Signed-off-by: Like Xu <likexu@...cent.com>
>> ---
>>   arch/x86/kvm/pmu.c | 11 ++++++++---
>>   1 file changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>> index 02f9e4f245bd..08ee0fed63d5 100644
>> --- a/arch/x86/kvm/pmu.c
>> +++ b/arch/x86/kvm/pmu.c
>> @@ -106,9 +106,14 @@ static inline void __kvm_perf_overflow(struct kvm_pmc *pmc, bool in_pmi)
>>   		return;
>>   
>>   	if (pmc->perf_event && pmc->perf_event->attr.precise_ip) {
>> -		/* Indicate PEBS overflow PMI to guest. */
>> -		skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
>> -					      (unsigned long *)&pmu->global_status);
>> +		if (!in_pmi) {
>> +			/* The emulated instructions does not generate PEBS records. */
> 
> This needs a better comment.  IIUC, it's not that they don't generate records,
> it's that KVM is _choosing_ to not generate records to hack around a different
> bug(s).  If that's true a TODO or FIXME would also be nice.

Indeed. To give more of the context, this part would look like this:

		if (!in_pmi) {
			/*
			 * TODO: KVM is currently _choosing_ to not generate records
			 * for emulated instructions, avoiding a BUFFER_OVF PMI when
			 * there are no records. Strictly speaking, records should
			 * also be generated in the right context to improve sampling
			 * accuracy.
			 */
			skip_pmi = true;
		} else {
			/* Indicate PEBS overflow PMI to guest. */
			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
						      (unsigned long *)&pmu->global_status);
		}

What do you think?

> 
>> +			skip_pmi = true;
>> +		} else {
>> +			/* Indicate PEBS overflow PMI to guest. */
>> +			skip_pmi = __test_and_set_bit(GLOBAL_STATUS_BUFFER_OVF_BIT,
>> +						      (unsigned long *)&pmu->global_status);
>> +		}
>>   	} else {
>>   		__set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
>>   	}
>> -- 
>> 2.37.0
>>
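
For readers following the thread, the three-way decision in the hunk above can be modeled in user space. This is only a minimal sketch, not the kernel code: the `model_*` names and structs are hypothetical stand-ins, `precise` stands in for `pmc->perf_event->attr.precise_ip` being non-zero, and the bit position 62 for BUFFER_OVF is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position for BUFFER_OVF in global_status (illustration only). */
#define MODEL_BUFFER_OVF_BIT 62

/* Hypothetical stand-ins for the KVM PMU state; not the kernel's structs. */
struct model_pmu {
	uint64_t global_status;
};

struct model_pmc {
	int idx;       /* counter index, i.e. its bit in global_status */
	bool precise;  /* stands in for perf_event->attr.precise_ip != 0 */
};

/* Non-atomic test-and-set: set the bit and return its previous value. */
bool model_test_and_set_bit(int bit, uint64_t *word)
{
	uint64_t mask = UINT64_C(1) << bit;
	bool old = (*word & mask) != 0;

	*word |= mask;
	return old;
}

/* Returns true when the overflow PMI should be skipped. */
bool model_overflow(struct model_pmu *pmu, const struct model_pmc *pmc,
		    bool in_pmi)
{
	bool skip_pmi = false;

	if (pmc->precise) {
		if (!in_pmi) {
			/* Emulated instruction: no record, so no BUFFER_OVF PMI. */
			skip_pmi = true;
		} else {
			/* Skip the PMI only if BUFFER_OVF was already pending. */
			skip_pmi = model_test_and_set_bit(MODEL_BUFFER_OVF_BIT,
							  &pmu->global_status);
		}
	} else {
		/* Non-PEBS counter: flag its own overflow bit. */
		pmu->global_status |= UINT64_C(1) << pmc->idx;
	}

	return skip_pmi;
}
```

In this model, a PEBS counter overflowing outside PMI context leaves `global_status` untouched and reports the PMI as skipped, while the same counter overflowing in PMI context sets the BUFFER_OVF bit and skips the PMI only if that bit was already set.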