linux-kernel - Re: [PATCH] KVM: X86: correct trace_kvm_pv_tlb

Message-ID: <Ynpq1hmtO+Yu21J2@google.com>
Date:   Tue, 10 May 2022 13:38:30 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Vitaly Kuznetsov <vkuznets@...hat.com>
Cc:     Jon Kohler <jon@...anix.com>, Paolo Bonzini <pbonzini@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] KVM: X86: correct trace_kvm_pv_tlb_flush stats

On Tue, May 10, 2022, Vitaly Kuznetsov wrote:
> Jon Kohler <jon@...anix.com> writes:
> 
> >> On May 5, 2022, at 4:09 AM, Vitaly Kuznetsov <vkuznets@...hat.com> wrote:
> >> 
> >> Jon Kohler <jon@...anix.com> writes:
> >> 
> >>>> On May 4, 2022, at 5:47 PM, Sean Christopherson <seanjc@...gle.com> wrote:
> >>>> 
> >> 
> >> ...
> >> 
> >>> 
> >>> The net problem here is really that the stat is likely incorrect; however,
> >>> one other oddity I didn’t quite understand after looking into this is that
> >>> the call site for all of this is in record_steal_time(), which is only called
> >>> from vcpu_enter_guest(), and that is called *after*
> >>> kvm_service_local_tlb_flush_requests(), which also calls
> >>> kvm_vcpu_flush_tlb_guest() if request == KVM_REQ_TLB_FLUSH_GUEST
> >>> 
> >>> That request may be set from a few different places.
> >>> 
> >>> I don’t have any proof of this, but it seems to me like we might have a
> >>> situation where we double flush?
> >>> 
> >>> Put another way, I wonder if there is any sense behind maybe hoisting
> >>> if (kvm_check_request(KVM_REQ_STEAL_UPDATE, vcpu)) up before
> >>> the other TLB flushes, and have it clear FLUSH_GUEST if it was set?
> >> 
> >> Indeed, if we move KVM_REQ_STEAL_UPDATE check/record_steal_time() call
> >> in vcpu_enter_guest() before kvm_service_local_tlb_flush_requests(), we
> >> can probably get away with kvm_make_request(KVM_REQ_TLB_FLUSH_GUEST,
> >> vcpu) in record_steal_time() which would help to avoid double flushing.
> >
> > Thanks, Vitaly, I’ll rework this one and incorporate that. In the meantime, do you
> > have any suggestions on Sean's concern about losing the trace in situations
> > where pv tlb flushing isn’t happening?
> >
> 
> No strong preference from my side but there are multiple places which
> conditionally cause TLB flush but we don't have tracepoints saying
> "flush could've been done but wasn't" there, right?

IMO this one is different because it's an explicit request from the guest that is
otherwise not traced, whereas e.g. INVLPG will show up in exits.

> Also,
> kvm_vcpu_flush_tlb_all()/kvm_vcpu_flush_tlb_guest()/kvm_vcpu_flush_tlb_current()
> don't seem to have tracepoints so we don't actually record when we
> flush. Hyper-V TLB flush has its own tracepoints
> (trace_kvm_hv_flush_tlb()/trace_kvm_hv_flush_tlb_ex()) though.
> This probably deserves a cleanup if we want TLB flush to be debuggable
> without code instrumentation.

I don't have a preference either way.  I'm not opposed to tracing flushes, but I'm
also more than a bit skeptical that any non-trivial TLB bugs will be debuggable via
tracepoints.