Message-ID: <0a7785b3e985ec98b7f94f149afabdb86efb08d5.camel@intel.com>
Date: Fri, 29 Aug 2025 18:34:39 +0000
From: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>
To: "pbonzini@...hat.com" <pbonzini@...hat.com>, "seanjc@...gle.com"
	<seanjc@...gle.com>
CC: "Huang, Kai" <kai.huang@...el.com>, "ackerleytng@...gle.com"
	<ackerleytng@...gle.com>, "Annapurve, Vishal" <vannapurve@...gle.com>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "Zhao, Yan Y"
	<yan.y.zhao@...el.com>, "Weiny, Ira" <ira.weiny@...el.com>,
	"kvm@...r.kernel.org" <kvm@...r.kernel.org>, "michael.roth@....com"
	<michael.roth@....com>
Subject: Re: [RFC PATCH v2 02/18] KVM: x86/mmu: Add dedicated API to map
 guest_memfd pfn into TDP MMU

On Thu, 2025-08-28 at 17:06 -0700, Sean Christopherson wrote:
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -4994,6 +4994,65 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
>  	return min(range->size, end - range->gpa);
>  }
>  
> +int kvm_tdp_mmu_map_private_pfn(struct kvm_vcpu *vcpu, gfn_t gfn, kvm_pfn_t pfn)
> +{
> +	struct kvm_page_fault fault = {
> +		.addr = gfn_to_gpa(gfn),
> +		.error_code = PFERR_GUEST_FINAL_MASK | PFERR_PRIVATE_ACCESS,
> +		.prefetch = true,
> +		.is_tdp = true,
> +		.nx_huge_page_workaround_enabled = is_nx_huge_page_enabled(vcpu->kvm),

These faults don't set fault->exec, so nx_huge_page_workaround_enabled
shouldn't be a factor. Not a functional issue, though. Maybe setting it is
more robust?
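
For illustration only, a minimal userspace sketch of that reasoning. It assumes
the workaround restricts huge pages only when the fault is an instruction
fetch. The struct and helpers are hypothetical stand-ins, not the kernel's
struct kvm_page_fault or any real KVM function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the two fields under discussion. */
struct fault_model {
	bool exec;                            /* fault is an instruction fetch */
	bool nx_huge_page_workaround_enabled; /* state of the mitigation knob */
};

/*
 * Assumption: a huge page is refused only for an exec fault taken while the
 * workaround is enabled.
 */
bool huge_page_disallowed(const struct fault_model *f)
{
	return f->exec && f->nx_huge_page_workaround_enabled;
}

/* Under that assumption, a non-exec fault behaves the same either way. */
void check_non_exec_fault(void)
{
	struct fault_model on  = { .exec = false, .nx_huge_page_workaround_enabled = true };
	struct fault_model off = { .exec = false, .nx_huge_page_workaround_enabled = false };

	assert(huge_page_disallowed(&on) == huge_page_disallowed(&off));
}
```

In this model the knob cannot change the outcome for a fault with exec clear,
which is why setting it here looks harmless but unnecessary.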

> +
> +		.max_level = PG_LEVEL_4K,
> +		.req_level = PG_LEVEL_4K,
> +		.goal_level = PG_LEVEL_4K,
> +		.is_private = true,
> +
> +		.gfn = gfn,
> +		.slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn),
> +		.pfn = pfn,
> +		.map_writable = true,
> +	};
> +	struct kvm *kvm = vcpu->kvm;
> +	int r;
> +
> +	lockdep_assert_held(&kvm->slots_lock);
> +
> +	if (KVM_BUG_ON(!tdp_mmu_enabled, kvm))
> +		return -EIO;
> +
> +	if (kvm_gfn_is_write_tracked(kvm, fault.slot, fault.gfn))
> +		return -EPERM;

If we care about this, why don't we care about the read-only memslot flag? TDX
doesn't need this check or the NX huge page part above, so this function is
more general than what TDX needs.

What about calling it __kvm_tdp_mmu_map_private_pfn() and making it a powerful
"map this PFN at this GFN and don't ask questions" function? Otherwise, I'm not
sure where to draw the line.
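
A rough userspace sketch of the split suggested above, under stated
assumptions: the slot model, its field names, and both function names are
hypothetical, and none of this is kernel code. It only shows the shape of an
unconditional mapping helper with policy checks left to a wrapper.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a memslot; fields are illustrative only. */
struct slot_model {
	bool write_tracked;
	bool readonly;
	uint64_t mapped_pfn; /* 0 means nothing is mapped */
};

/* "Don't ask questions" helper: installs the mapping unconditionally. */
int raw_map_private_pfn(struct slot_model *slot, uint64_t pfn)
{
	slot->mapped_pfn = pfn;
	return 0;
}

/* Policy wrapper: a caller that cares about slot restrictions checks here. */
int checked_map_private_pfn(struct slot_model *slot, uint64_t pfn)
{
	if (slot->write_tracked || slot->readonly)
		return -EPERM;

	return raw_map_private_pfn(slot, pfn);
}

/* Usage: the wrapper refuses a read-only slot, the raw helper does not. */
void demo(void)
{
	struct slot_model ro = { .write_tracked = false, .readonly = true, .mapped_pfn = 0 };

	assert(checked_map_private_pfn(&ro, 7) == -EPERM);
	assert(ro.mapped_pfn == 0);
	assert(raw_map_private_pfn(&ro, 7) == 0);
	assert(ro.mapped_pfn == 7);
}
```

With this shape, the question of which checks belong in the mapping path
moves to the callers, so the helper itself has no line to draw.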

> +
> +	r = kvm_mmu_reload(vcpu);
> +	if (r)
> +		return r;
> +
> +	r = mmu_topup_memory_caches(vcpu, false);
> +	if (r)
> +		return r;
> +
> +	do {
> +		if (signal_pending(current))
> +			return -EINTR;
> +
> +		if (kvm_test_request(KVM_REQ_VM_DEAD, vcpu))
> +			return -EIO;
> +
> +		cond_resched();
> +
> +		guard(read_lock)(&kvm->mmu_lock);
> +
> +		r = kvm_tdp_mmu_map(vcpu, &fault);
> +	} while (r == RET_PF_RETRY);
> +
> +	if (r != RET_PF_FIXED)
> +		return -EIO;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_tdp_mmu_map_private_pfn);