Date:	Tue, 09 Nov 2010 16:48:40 +0800
From:	Xiao Guangrong <xiaoguangrong@...fujitsu.com>
To:	Gleb Natapov <gleb@...hat.com>
CC:	Avi Kivity <avi@...hat.com>, Marcelo Tosatti <mtosatti@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>, KVM <kvm@...r.kernel.org>
Subject: Re: [PATCH 2/3] KVM: MMU: do not retry #PF for nonpaging guest

On 11/09/2010 04:03 PM, Gleb Natapov wrote:
> On Fri, Nov 05, 2010 at 01:39:18PM +0800, Xiao Guangrong wrote:
>> On 11/04/2010 06:35 PM, Gleb Natapov wrote:
>>> On Thu, Nov 04, 2010 at 06:32:42PM +0800, Xiao Guangrong wrote:
>>>> a nonpaging guest's 'direct_map' is also true; retrying #PF for those
>>>> guests is useless, so use 'tdp_enabled' instead
>>>>
>>> A nonpaging guest will not attempt async pf.
>>
>> Ah, my mistake, but why can we not attempt async pf for a nonpaging guest?
>>
>>> And by checking tdp_enabled
>>> here instead of direct_map we will break nested NPT.
>>>
>>
>> It looks like something is broken: an apf can be generated in the L2 guest
>> (nested NPT guest) and be retried in the L1 guest.
>>
> OK, now that Xiao has explained the problem to me privately, I can review this.
> 
>> The patch below fixes it and lets nonpaging guests support async pf. I'll post it
>> properly if you like. :-)
>>
> Apf for nonpaging guests should be a separate patch, of course.

Sure, I'll separate it when I post.

> 
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index 7f20f2c..606978e 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -600,6 +600,7 @@ struct kvm_x86_ops {
>>  struct kvm_arch_async_pf {
>>  	u32 token;
>>  	gfn_t gfn;
>> +	bool softmmu;
>>  };
>>  
>>  extern struct kvm_x86_ops *kvm_x86_ops;
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index f3fad4f..48ca312 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -2602,6 +2607,7 @@ static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn)
>>  	struct kvm_arch_async_pf arch;
>>  	arch.token = (vcpu->arch.apf.id++ << 12) | vcpu->vcpu_id;
>>  	arch.gfn = gfn;
>> +	arch.softmmu = mmu_is_softmmu(vcpu);
>>  
> We can do:
>         if (mmu_is_nested(vcpu))
> 		gva = vcpu->mmu.gva_to_gpa(gva);
> And this should fix everything, no?
> 

No, since it doesn't help us avoid the NPF when the nested guest runs again.

In more detail:

   +--------------+ nested guest VA
L2 | nested guest |
   +--------------+ nested guest PA       +    +
      |                                   |    |
   +--------------+ guest VA              |    |
L1 | guest        |                       |    |
   +--------------+ guest PA          +   |    V
      |                               |   |
   +--------------+ host VA           |   |
L0 | host         |                   |   |
   +--------------+ host PA           V   V
                                    PT10 PT20 PT21

If we do it that way, we will prefault 'gva' in PT10, but the L2 guest's
execution does not depend on PT10; the NPF is avoided only if we prefault
it in PT20.
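
To make the diagram concrete, here is a toy user-space model (plain C with
invented names, nothing from KVM): populating the PT10 mapping, as the
gva_to_gpa() suggestion would, leaves PT20 untouched, so the L2 access still
faults.

    #include <stdio.h>
    #include <stdbool.h>

    #define NPAGES 16

    /* Toy "page tables": pt10 maps L1 GPA -> host PA (what L0 maintains
     * for L1), pt20 maps L2 GPA -> host PA (what L2 actually runs on).
     * -1 means not present, i.e. an access would take an NPF. */
    static long pt10[NPAGES], pt20[NPAGES];

    static bool present(long *pt, int gfn)
    {
            return pt[gfn] != -1;
    }

    int main(void)
    {
            int gfn = 5;

            for (int i = 0; i < NPAGES; i++)
                    pt10[i] = pt20[i] = -1;

            /* Prefault the translated address into PT10 only. */
            pt10[gfn] = 0x1000 + gfn;

            /* L2 runs through PT20, which is still empty: NPF recurs. */
            printf("L2 access via PT20: %s\n",
                   present(pt20, gfn) ? "hit" : "still faults");
            return 0;
    }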

>>  	return kvm_setup_async_pf(vcpu, gva, gfn, &arch);
>>  }
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 2044302..d826d78 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -6172,9 +6172,10 @@ EXPORT_SYMBOL_GPL(kvm_set_rflags);
>>  
>>  void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
>>  {
>> +	bool softmmu = mmu_is_softmmu(vcpu);
>>  	int r;
>>  
>> -	if (!vcpu->arch.mmu.direct_map || is_error_page(work->page))
>> +	if (softmmu || work->arch.softmmu || is_error_page(work->page))
>>  		return;
>>  
>>  	r = kvm_mmu_reload(vcpu);
>> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
>> index 2cea414..48796c7 100644
>> --- a/arch/x86/kvm/x86.h
>> +++ b/arch/x86/kvm/x86.h
>> @@ -55,6 +55,11 @@ static inline bool mmu_is_nested(struct kvm_vcpu *vcpu)
>>  	return vcpu->arch.walk_mmu == &vcpu->arch.nested_mmu;
>>  }
>>  
>> +static inline bool mmu_is_softmmu(struct kvm_vcpu *vcpu)
>> +{
>> +	return !tdp_enabled || mmu_is_nested(vcpu);
>> +}
>> +
> Isn't this the same as checking vcpu->arch.mmu.direct_map? The only
> difference is that this will be true for nonpaging mode too, but that is
> even better since we can prefault in nonpaging mode.
> 

Um, maybe prefaulting in nonpaging mode is unreliable.

For shadow paging: it doesn't know that CR0.PG has changed. For example, CR0.PG = 1
when the apf is generated and CR0.PG = 0 when the apf is completed, so it can
prefault in a different mmu context.

For nested paging: if the nested guest vcpu is nonpaging, it will prefault PT10's NPF
in PT20, which means the apf is generated in the L1 guest but prefaulted in the L2 guest.
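
As a sketch of the shadow-paging concern above (again plain C with invented
names, not KVM code): snapshot the paging mode when the apf is queued and
compare it at completion; if CR0.PG changed in between, the prefault would
land in a different mmu context and should be skipped.

    #include <stdbool.h>
    #include <stdio.h>

    /* Invented names; a sketch of the concern, not KVM code. */
    struct toy_vcpu { bool cr0_pg; };
    struct toy_apf  { bool cr0_pg_at_queue; };

    static void queue_apf(struct toy_vcpu *v, struct toy_apf *w)
    {
            w->cr0_pg_at_queue = v->cr0_pg;  /* snapshot paging mode */
    }

    static void apf_ready(struct toy_vcpu *v, struct toy_apf *w)
    {
            /* If CR0.PG changed in between, the fault was taken in an
             * mmu context that no longer exists; prefaulting now would
             * go into a different context, so skip it. */
            if (w->cr0_pg_at_queue != v->cr0_pg) {
                    printf("paging mode changed, skip prefault\n");
                    return;
            }
            printf("prefault in the same mmu context\n");
    }

    int main(void)
    {
            struct toy_vcpu v = { .cr0_pg = true };
            struct toy_apf w;

            queue_apf(&v, &w);
            v.cr0_pg = false;  /* guest clears CR0.PG before completion */
            apf_ready(&v, &w);
            return 0;
    }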