linux-kernel - Re: [PATCH v5 09/12] Retry fault before vmentry

Message-ID: <20100824093356.GY10499@redhat.com>
Date:	Tue, 24 Aug 2010 12:33:56 +0300
From:	Gleb Natapov <gleb@...hat.com>
To:	Avi Kivity <avi@...hat.com>
Cc:	kvm@...r.kernel.org, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org, mingo@...e.hu,
	a.p.zijlstra@...llo.nl, tglx@...utronix.de, hpa@...or.com,
	riel@...hat.com, cl@...ux-foundation.org, mtosatti@...hat.com
Subject: Re: [PATCH v5 09/12] Retry fault before vmentry

On Tue, Aug 24, 2010 at 12:25:33PM +0300, Avi Kivity wrote:
>  On 07/19/2010 06:30 PM, Gleb Natapov wrote:
> >When a page is swapped in, it is mapped into guest memory only after the
> >guest tries to access it again and generates another fault. To save this
> >fault we can map the page immediately, since we know that the guest is
> >going to access it.
> >
> >
> >
> >-static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa,
> >-				u32 error_code)
> >+static int tdp_page_fault(struct kvm_vcpu *vcpu, gva_t gpa, u32 error_code,
> >+			  bool sync)
> 
> 'sync' means something else in the shadow mmu.  Please rename to
> something longer, maybe 'apf_completion'.
> 
> Alternatively, split to two functions, a base function that doesn't
> do apf and a wrapper that handles apf.
> 
Will rename to something else.
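
For illustration, the split suggested above (a base function that knows nothing about async PF, plus a wrapper that handles it) could look roughly like the sketch below. This is a self-contained toy model, not kernel code: struct toy_vcpu, toy_map_fault() and toy_page_fault() are hypothetical names, and the "mapping" is reduced to a counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct kvm_vcpu; not the kernel type. */
struct toy_vcpu {
	bool page_resident;	/* is the backing page already in memory? */
	bool apf_enabled;	/* may this fault be handled asynchronously? */
	int apf_queued;		/* faults handed off to the async worker */
	int mapped;		/* pages mapped into the guest */
};

enum toy_fault_result { TOY_MAPPED, TOY_APF_QUEUED };

/*
 * Base function: maps the page and knows nothing about async PF.
 * In a real MMU this is where a synchronous swap-in would happen
 * if the page were not resident; here it only bumps a counter.
 */
static enum toy_fault_result toy_map_fault(struct toy_vcpu *vcpu, uint64_t gpa)
{
	(void)gpa;		/* the toy model does not track addresses */
	vcpu->mapped++;
	return TOY_MAPPED;
}

/*
 * Wrapper: decides whether the fault can go async. If not, it falls
 * through to the base function, so no boolean flag is needed.
 */
static enum toy_fault_result toy_page_fault(struct toy_vcpu *vcpu, uint64_t gpa)
{
	if (!vcpu->page_resident && vcpu->apf_enabled) {
		vcpu->apf_queued++;
		return TOY_APF_QUEUED;
	}
	return toy_map_fault(vcpu, gpa);
}
```

In this arrangement the completion path would call the base function directly once the page is present, so the extra bool argument disappears from both signatures.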

> >@@ -505,6 +506,37 @@ out_unlock:
> >  	return 0;
> >  }
> >
> >+static int FNAME(page_fault_other_cr3)(struct kvm_vcpu *vcpu, gpa_t cr3,
> >+				       gva_t addr, u32 error_code)
> >+{
> >+	int r = 0;
> >+	gpa_t curr_cr3 = vcpu->arch.cr3;
> >+
> >+	if (curr_cr3 != cr3) {
> >+		/*
> >+		 * We do page fault on behalf of a process that is sleeping
> >+		 * because of async PF. PV guest takes reference to mm that cr3
> >+		 * belongs to, so it has to be valid here.
> >+		 */
> >+		kvm_set_cr3(vcpu, cr3);
> >+		if (kvm_mmu_reload(vcpu))
> >+			goto switch_cr3;
> >+	}
> 
> With nested virtualization, we need to switch cr0, cr4, and efer as well...
> 
On SVM or VMX or both?

> >+
> >+	r = FNAME(page_fault)(vcpu, addr, error_code, true);
> >+
> >+	if (kvm_check_request(KVM_REQ_MMU_SYNC, vcpu))
> >+		kvm_mmu_sync_roots(vcpu);
> 
> Why is this needed?
> 
http://www.mail-archive.com/kvm@vger.kernel.org/msg37827.html

 KVM_REQ_MMU_SYNC request generated here must be processed before
 switching to a different cr3 (otherwise vcpu_enter_guest will process it 
 with the wrong cr3 in place).
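
The ordering constraint can be shown with a small standalone model. This is only a sketch under simplifying assumptions: struct sketch_vcpu and the sketch_*() helpers are hypothetical, the sync request is a single bool standing in for KVM_REQ_MMU_SYNC, and "syncing roots" merely records which cr3 was active at the time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the vcpu state that matters here. */
struct sketch_vcpu {
	uint64_t cr3;			/* currently loaded guest cr3 */
	bool sync_pending;		/* models KVM_REQ_MMU_SYNC */
	uint64_t last_synced_cr3;	/* cr3 that was active during the sync */
};

/* Models kvm_check_request(): test and clear the pending request. */
static bool sketch_check_sync(struct sketch_vcpu *v)
{
	bool pending = v->sync_pending;

	v->sync_pending = false;
	return pending;
}

/* Models kvm_mmu_sync_roots(): records which cr3 the sync ran under. */
static void sketch_sync_roots(struct sketch_vcpu *v)
{
	v->last_synced_cr3 = v->cr3;
}

/* Models the fault handler; assume it always raises a sync request. */
static void sketch_handle_fault(struct sketch_vcpu *v)
{
	v->sync_pending = true;
}

/* Handle a fault on behalf of another address space, then switch back. */
static void sketch_fault_other_cr3(struct sketch_vcpu *v, uint64_t cr3)
{
	uint64_t saved = v->cr3;

	if (saved != cr3)
		v->cr3 = cr3;

	sketch_handle_fault(v);

	/* Consume the request while the temporary cr3 is still loaded. */
	if (sketch_check_sync(v))
		sketch_sync_roots(v);

	if (v->cr3 != saved)
		v->cr3 = saved;
}
```

If the check were left out, the request would still be pending after the original cr3 is restored, and whoever processed it later would sync the roots of the wrong address space.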


> >+
> >+switch_cr3:
> >+	if (curr_cr3 != vcpu->arch.cr3) {
> >+		kvm_set_cr3(vcpu, curr_cr3);
> >+		kvm_mmu_reload(vcpu);
> >+	}
> >+
> >+	return r;
> >+}
> 
> This has the nasty effect of flushing the TLB on AMD.
> 
Which is more expensive: re-entering the guest and handling one more fault,
or flushing the TLB here?

> >+
> >  static void FNAME(invlpg)(struct kvm_vcpu *vcpu, gva_t gva)
> >  {
> >  	struct kvm_shadow_walk_iterator iterator;
> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >index 2603cc4..5482db0 100644
> >--- a/arch/x86/kvm/x86.c
> >+++ b/arch/x86/kvm/x86.c
> >@@ -5743,6 +5743,15 @@ void kvm_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_set_rflags);
> >
> >+void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu,
> >+			       struct kvm_async_pf *work)
> >+{
> >+	if (!vcpu->arch.mmu.page_fault_other_cr3 || is_error_page(work->page))
> >+		return;
> >+	vcpu->arch.mmu.page_fault_other_cr3(vcpu, work->arch.cr3, work->gva,
> >+					    work->arch.error_code);
> >+}
> >+
> >  static int apf_put_user(struct kvm_vcpu *vcpu, u32 val)
> >  {
> >  	if (unlikely(vcpu->arch.apf_memslot_ver !=
> >diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> >index f56e8ac..de1d5b6 100644
> >--- a/virt/kvm/kvm_main.c
> >+++ b/virt/kvm/kvm_main.c
> >@@ -1348,6 +1348,7 @@ void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu)
> >  			spin_lock(&vcpu->async_pf_lock);
> >  			list_del(&work->link);
> >  			spin_unlock(&vcpu->async_pf_lock);
> >+			kvm_arch_async_page_ready(vcpu, work);
> >  			put_page(work->page);
> >  			async_pf_work_free(work);
> >  			list_del(&work->queue);
> >@@ -1366,6 +1367,7 @@ void kvm_check_async_pf_completion(struct kvm_vcpu *vcpu)
> >  	list_del(&work->queue);
> >  	vcpu->async_pf_queued--;
> >
> >+	kvm_arch_async_page_ready(vcpu, work);
> >  	kvm_arch_inject_async_page_present(vcpu, work);
> >
> >  	put_page(work->page);
> 
> 
> -- 
> error compiling committee.c: too many arguments to function

--
			Gleb.