Date:	Tue, 29 Jun 2010 16:07:40 +0800
From:	Xiao Guangrong <xiaoguangrong@...fujitsu.com>
To:	Marcelo Tosatti <mtosatti@...hat.com>
CC:	Avi Kivity <avi@...hat.com>, LKML <linux-kernel@...r.kernel.org>,
	KVM list <kvm@...r.kernel.org>
Subject: Re: [PATCH v2 8/10] KVM: MMU: prefetch ptes when intercepted guest
 #PF



Marcelo Tosatti wrote:

>> +
>> +	if (sp->role.level > PT_PAGE_TABLE_LEVEL)
>> +		return;
>> +
>> +	if (sp->role.direct)
>> +		return direct_pte_prefetch(vcpu, sptep);
> 
> Can never happen.
> 

Marcelo,

Thanks for your comment. Do you mean that sp->role.direct can never be true here?
Could you please tell me why? In my testing, this path can be triggered.


>> @@ -322,6 +395,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu *vcpu, gva_t addr,
>>  				     user_fault, write_fault,
>>  				     dirty, ptwrite, level,
>>  				     gw->gfn, pfn, false, true);
>> +			FNAME(pte_prefetch)(vcpu, sptep);
>>  			break;
>>  		}
> 
> 
> I'm afraid this can introduce regressions since it increases mmu_lock
> contention. Can you get some numbers with 4-vcpu or 8-vcpu guest and
> many threads benchmarks, such as kernbench and apachebench? (on
> non-EPT).
> 

The pte prefetch is on the fast path and takes very little time: in the worst
case we only need to read 128 bytes of guest ptes. If the prefetch succeeds,
the #PF that a later access would cause is avoided, so we skip the exit from
the guest, the guest pte walk, the shadow page walk, the local tlb flush...
a lot of work is saved.
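
To make the "128 byte" figure concrete, here is a small stand-alone sketch
(my own illustration, not code from the patch; the cluster size and the names
are assumptions) of the window arithmetic: 16 guest ptes of 8 bytes each, with
the faulting index aligned down to the start of its cluster.

/* Stand-alone user-space sketch of the prefetch-window arithmetic.
 * Assumes 64-bit guest ptes and a 16-entry cluster (16 * 8 = 128 bytes).
 */
#include <stdio.h>

#define PTE_PREFETCH_NUM	16	/* entries per cluster (assumption) */
#define GUEST_PTE_SIZE		sizeof(unsigned long long)

int main(void)
{
	unsigned int fault_index = 203;	/* faulting pte's index in its table */
	/* align the faulting index down to the start of its cluster */
	unsigned int start = fault_index & ~(PTE_PREFETCH_NUM - 1);

	printf("prefetch ptes %u..%u, read %zu bytes of guest ptes\n",
	       start, start + PTE_PREFETCH_NUM - 1,
	       PTE_PREFETCH_NUM * GUEST_PTE_SIZE);
	return 0;
}

So a single guest page-table read covers the whole cluster, and every later
access that hits one of the prefetched ptes skips the #PF exit entirely.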

Before I posted the first version of this patchset, I ran a performance test
with unixbench; it showed a ~3.6% improvement with EPT disabled.
(It is in the first version's changelog.)

Today I ran kernbench with a 4-vcpu, 1G-memory guest; the result shows a ~1.6%
improvement :-)

> Also prefetch should be disabled for EPT, due to lack of accessed bit.
> 

But we call mmu_set_spte() with speculative == false, so it does not touch the accessed bit.