linux-kernel - Re: [PATCH v3 1/2] KVM: x86: Check hypercall's exit to userspace generically

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ca1eab63d443c2c92c367cee418ae969ba90d6cd.camel@intel.com>
Date: Fri, 1 Nov 2024 21:13:01 +0000
From: "Huang, Kai" <kai.huang@...el.com>
To: "seanjc@...gle.com" <seanjc@...gle.com>
CC: "Edgecombe, Rick P" <rick.p.edgecombe@...el.com>,
	"binbin.wu@...ux.intel.com" <binbin.wu@...ux.intel.com>, "Li, Xiaoyao"
	<xiaoyao.li@...el.com>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"pbonzini@...hat.com" <pbonzini@...hat.com>, "Yamahata, Isaku"
	<isaku.yamahata@...el.com>, "yuan.yao@...ux.intel.com"
	<yuan.yao@...ux.intel.com>
Subject: Re: [PATCH v3 1/2] KVM: x86: Check hypercall's exit to userspace
 generically

On Fri, 2024-11-01 at 09:39 -0700, Sean Christopherson wrote:
> On Fri, Nov 01, 2024, Kai Huang wrote:
> > On Thu, 2024-10-31 at 07:54 -0700, Sean Christopherson wrote:
> > > On Thu, Oct 31, 2024, Kai Huang wrote:
> > > -	ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl);
> > > -	if (nr == KVM_HC_MAP_GPA_RANGE && !ret)
> > > -		/* MAP_GPA tosses the request to the user space. */
> > > -		return 0;
> > > +	r = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit, cpl, &ret);
> > > +	if (r <= r)
> > > +		return r;
> > 
> > ... should be:
> > 
> > 	if (r <= 0)
> > 		return r;
> > 
> > ?
> > 
> > Another option might be we move "set hypercall return value" code inside
> > __kvm_emulate_hypercall().  So IIUC the reason to split
> > __kvm_emulate_hypercall() out is for TDX, and while non-TDX uses RAX to carry
> > the hypercall return value, TDX uses R10.
> > 
> > We can additionally pass a "kvm_hypercall_set_ret_func" function pointer to
> > __kvm_emulate_hypercall(), and invoke it inside.  Then we can change
> > __kvm_emulate_hypercall() to return: 
> >     < 0 error, 
> >     ==0 return to userspace, 
> >     > 0 go back to guest.
> 
> Hmm, and the caller can still handle kvm_skip_emulated_instruction(), because the
> return value is KVM's normal pattern.
> 
> I like it!
> 
> But, there's no need to pass a function pointer, KVM can write (and read) arbitrary
> GPRs, it's just avoided in most cases so that the sanity checks and available/dirty
> updates are elided.  For this code though, it's easy enough to keep kvm_rxx_read()
> for getting values, and eating the overhead of a single GPR write is a perfectly
> fine tradeoff for eliminating the return multiplexing.
> 
> Lightly tested.  Assuming this works for TDX and passes testing, I'll post a
> mini-series next week.
> 
> --
> From: Sean Christopherson <seanjc@...gle.com>
> Date: Fri, 1 Nov 2024 09:04:00 -0700
> Subject: [PATCH] KVM: x86: Refactor __kvm_emulate_hypercall() to accept reg
>  names, not values
> 
> Rework __kvm_emulate_hypercall() to take the names of input and output
> (guest return value) registers, as opposed to taking the input values and
> returning the output value.  As part of the refactor, change the actual
> return value from __kvm_emulate_hypercall() to be KVM's de facto standard
> of '0' == exit to userspace, '1' == resume guest, and -errno == failure.
> 
> Using the return value for KVM's control flow eliminates the multiplexed
> return value, where '0' for KVM_HC_MAP_GPA_RANGE (and only that hypercall)
> means "exit to userspace".
> 
> Use the direct GPR accessors to read values to avoid the pointless marking
> of the registers as available, but use kvm_register_write_raw() for the
> guest return value so that the innermost helper doesn't need to multiplex
> its return value.  Using the generic kvm_register_write_raw() adds very
> minimal overhead, so as a one-off in a relatively slow path it's well
> worth the code simplification.

Ah right :-)

> 
> Suggested-by: Kai Huang <kai.huang@...el.com>
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---

I think Binbin can help to test on TDX, and assuming it works,

Reviewed-by: Kai Huang <kai.huang@...el.com>