lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 3 Aug 2022 16:50:07 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Mingwei Zhang <mizhang@...gle.com>
Cc:     Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Jim Mattson <jmattson@...gle.com>,
        Joerg Roedel <joro@...tes.org>, kvm@...r.kernel.org,
        linux-kernel@...r.kernel.org, David Matlack <dmatlack@...gle.com>
Subject: Re: [PATCH v2 1/2] KVM: nested/x86: update trace_kvm_nested_vmrun()
 to suppot VMX

If we want to add a unique identifier for nested x86, use "KVM: x86/nested:" to
align with the MMU and to make `grep "KVM: x86"` viable.  Ideally, we'd align with
nVMX and nSVM, but nx86 is pretty gross and looks like a typo.

I have a slight preference for using a plain "KVM: x86:", as I suspect we'll never
have enough common nested to make it worth differentiating, but I've no objection
if you want to go with "KVM: x86/nested:".

On Mon, Jul 18, 2022, Mingwei Zhang wrote:
> Update trace_kvm_nested_vmrun() to support VMX by adding a new field 'isa';
> update the output to print out VMX/SVM related naming respectively,
> eg., vmcb vs. vmcs; npt vs. ept.
> 
> In addition, print nested EPT/NPT address instead of the 1bit of nested
> ept/npt on/off. This should convey more information in the trace. When
> nested ept/npt is not used, simply print "0x0" so that we don't lose any
> information.

Adding a new field that's not related to the VMX vs. SVM change belongs in a
separate patch.

> Opportunistically update the call site of trace_kvm_nested_vmrun() to make
> one line per parameter.
> 
> Signed-off-by: Mingwei Zhang <mizhang@...gle.com>
> ---
>  arch/x86/kvm/svm/nested.c |  7 +++++--
>  arch/x86/kvm/trace.h      | 29 ++++++++++++++++++++---------
>  2 files changed, 25 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index ba7cd26f438f..8581164b6808 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -724,11 +724,14 @@ int enter_svm_guest_mode(struct kvm_vcpu *vcpu, u64 vmcb12_gpa,
>  	struct vcpu_svm *svm = to_svm(vcpu);
>  	int ret;
>  
> -	trace_kvm_nested_vmrun(svm->vmcb->save.rip, vmcb12_gpa,
> +	trace_kvm_nested_vmrun(svm->vmcb->save.rip,
> +			       vmcb12_gpa,
>  			       vmcb12->save.rip,
>  			       vmcb12->control.int_ctl,
>  			       vmcb12->control.event_inj,
> -			       vmcb12->control.nested_ctl);
> +			       vmcb12->control.nested_ctl,
> +			       vmcb12->control.nested_cr3,
> +			       KVM_ISA_SVM);
>  
>  	trace_kvm_nested_intercepts(vmcb12->control.intercepts[INTERCEPT_CR] & 0xffff,
>  				    vmcb12->control.intercepts[INTERCEPT_CR] >> 16,
> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> index de4762517569..aac4c8bd2c3a 100644
> --- a/arch/x86/kvm/trace.h
> +++ b/arch/x86/kvm/trace.h
> @@ -580,8 +580,10 @@ TRACE_EVENT(kvm_pv_eoi,
>   */
>  TRACE_EVENT(kvm_nested_vmrun,

I think it makes sense to rename this to kvm_nested_vmenter.  VMRUN is SVM-only,
and tracepoints aren't ABI.

>  	    TP_PROTO(__u64 rip, __u64 vmcb, __u64 nested_rip, __u32 int_ctl,
> -		     __u32 event_inj, bool npt),
> -	    TP_ARGS(rip, vmcb, nested_rip, int_ctl, event_inj, npt),
> +		     __u32 event_inj, bool npt_enabled, __u64 npt_addr,
> +		     __u32 isa),
> +	    TP_ARGS(rip, vmcb, nested_rip, int_ctl, event_inj, npt_enabled,
> +		    npt_addr, isa),
>  
>  	TP_STRUCT__entry(
>  		__field(	__u64,		rip		)
> @@ -589,7 +591,9 @@ TRACE_EVENT(kvm_nested_vmrun,
>  		__field(	__u64,		nested_rip	)
>  		__field(	__u32,		int_ctl		)
>  		__field(	__u32,		event_inj	)
> -		__field(	bool,		npt		)
> +		__field(	bool,		npt_enabled	)

s/npt_enabled/tdp_enabled, or maybe ntdp_enabled?

> +		__field(	__u64,		npt_addr	)

Hmm, either

  s/npt_addr/nested_pgd

or 

  s/npt_addr/guest_pgd

"npt_addr" or "tdp_addr" is too ambiguous, e.g. it can be interpreted as the address
of a TDP page fault.

My vote would be for "guest_pgd" and then pass in the non-nested CR3 when L1 isn't
using nTDP.

> +		__field(	__u32,		isa		)
>  	),
>  
>  	TP_fast_assign(
> @@ -598,14 +602,21 @@ TRACE_EVENT(kvm_nested_vmrun,
>  		__entry->nested_rip	= nested_rip;
>  		__entry->int_ctl	= int_ctl;
>  		__entry->event_inj	= event_inj;
> -		__entry->npt		= npt;
> +		__entry->npt_enabled	= npt_enabled;
> +		__entry->npt_addr	= npt_addr;
> +		__entry->isa		= isa;
>  	),
>  
> -	TP_printk("rip: 0x%016llx vmcb: 0x%016llx nrip: 0x%016llx int_ctl: 0x%08x "
> -		  "event_inj: 0x%08x npt: %s",
> -		__entry->rip, __entry->vmcb, __entry->nested_rip,
> -		__entry->int_ctl, __entry->event_inj,
> -		__entry->npt ? "on" : "off")
> +	TP_printk("rip: 0x%016llx %s: 0x%016llx nested rip: 0x%016llx "
> +		  "int_ctl: 0x%08x event_inj: 0x%08x nested %s: 0x%016llx",

This needs to explicitly print "nTDP on/off".  As proposed, "nested ept/npt: %addr"
doesn't capture that.  (a) addr=0 is perfectly legal, and (b) addr!=0 is also legal
if nTDP isn't enabled, i.e. a non-zero nested_cr3/eptp is ignore by hardware

> +		__entry->rip,
> +		__entry->isa == KVM_ISA_VMX ? "vmcs" : "vmcb",
> +		__entry->vmcb,
> +		__entry->nested_rip,
> +		__entry->int_ctl,
> +		__entry->event_inj,
> +		__entry->isa == KVM_ISA_VMX ? "ept" : "npt",
> +		__entry->npt_enabled ? __entry->npt_addr : 0x0)
>  );
>  
>  TRACE_EVENT(kvm_nested_intercepts,
> -- 
> 2.37.0.170.g444d1eabd0-goog
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ