linux-kernel - Re: [PATCH 11/11] KVM: MMU: improve write flooding detected

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <4E2FE674.8070202@cn.fujitsu.com>
Date:	Wed, 27 Jul 2011 18:20:36 +0800
From:	Xiao Guangrong <xiaoguangrong@...fujitsu.com>
To:	Avi Kivity <avi@...hat.com>
CC:	Marcelo Tosatti <mtosatti@...hat.com>,
	LKML <linux-kernel@...r.kernel.org>, KVM <kvm@...r.kernel.org>
Subject: Re: [PATCH 11/11] KVM: MMU: improve write flooding detected

On 07/27/2011 05:23 PM, Avi Kivity wrote:
> On 07/26/2011 02:32 PM, Xiao Guangrong wrote:
>> Detecting write-flooding does not work well, when we handle page written, if
>> the last speculative spte is not accessed, we treat the page is
>> write-flooding, however, we can speculative spte on many path, such as pte
>> prefetch, page synced, that means the last speculative spte may be not point
>> to the written page and the written page can be accessed via other sptes, so
>> depends on the Accessed bit of the last speculative spte is not enough
>>
>> Instead of detected page accessed, we can detect whether the spte is accessed
>> or not, if the spte is not accessed but it is written frequently, we treat is
>> not a page table or it not used for a long time
>>
>>   static int get_free_pte_list_desc_nr(struct kvm_vcpu *vcpu)
>>   {
>>       struct kvm_mmu_memory_cache *cache;
>> @@ -3565,22 +3547,14 @@ static u64 mmu_pte_write_fetch_gpte(struct kvm_vcpu *vcpu, gpa_t *gpa,
>>    * If we're seeing too many writes to a page, it may no longer be a page table,
>>    * or we may be forking, in which case it is better to unmap the page.
>>    */
>> -static bool detect_write_flooding(struct kvm_vcpu *vcpu, gfn_t gfn)
>> +static bool detect_write_flooding(struct kvm_mmu_page *sp, u64 *spte)
>>   {
>> -    bool flooded = false;
>> -
>> -    if (gfn == vcpu->arch.last_pt_write_gfn
>> -    &&  !last_updated_pte_accessed(vcpu)) {
>> -        ++vcpu->arch.last_pt_write_count;
>> -        if (vcpu->arch.last_pt_write_count>= 3)
>> -            flooded = true;
>> -    } else {
>> -        vcpu->arch.last_pt_write_gfn = gfn;
>> -        vcpu->arch.last_pt_write_count = 1;
>> -        vcpu->arch.last_pte_updated = NULL;
>> -    }
>> +    if (spte&&  !(*spte&  shadow_accessed_mask))
>> +        sp->write_flooding_count++;
>> +    else
>> +        sp->write_flooding_count = 0;
>>
>> -    return flooded;
>> +    return sp->write_flooding_count>= 3;
>>   }
> 
> I think this is a little dangerous.  A guest kernel may be instantiating multiple gptes on a page fault, but guest userspace hits only one of them (the one which caused the page fault) - I think Windows does this, but I'm not sure.
> 

I think this case is not bad: if the guest kernel need to write multiple gptes (>=3),
it will cause many page fault, we do better zap the shadow page and let it become writable as
soon as possible.
(And, we have pte-fetch, it can quickly establish the mapping for a new shadow page)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/