[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <53F2C997.6070605@linux.vnet.ibm.com>
Date: Tue, 19 Aug 2014 11:50:47 +0800
From: Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>
To: David Matlack <dmatlack@...gle.com>
CC: Paolo Bonzini <pbonzini@...hat.com>,
Gleb Natapov <gleb@...nel.org>,
Avi Kivity <avi.kivity@...il.com>, mtosatti@...hat.com,
linux-kernel@...r.kernel.org, kvm@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH 1/2] KVM: fix cache stale memslot info with correct mmio
generation number
On 08/19/2014 05:15 AM, David Matlack wrote:
> On Mon, Aug 18, 2014 at 12:56 PM, Xiao Guangrong
> <xiaoguangrong.eric@...il.com> wrote:
>> @@ -287,9 +293,15 @@ static bool set_mmio_spte(struct kvm *kvm, u64 *sptep, gfn_t gfn,
>>
>> static bool check_mmio_spte(struct kvm *kvm, u64 spte)
>> {
>> + struct kvm_memslots *slots = kvm_memslots(kvm);
>> unsigned int kvm_gen, spte_gen;
>>
>> - kvm_gen = kvm_current_mmio_generation(kvm);
>> + if (slots->updated)
>> + return false;
>> +
>> + smp_rmb();
>> +
>> + kvm_gen = __kvm_current_mmio_generation(slots);
>> spte_gen = get_mmio_spte_generation(spte);
>>
>
> What does this fix? Case 2 can still happen. (Case 2 is unavoidable unless we
> block during memslot updates, which I don't think we should :).
This exactly fixes case 2, slots->updated just acts as the "low bit"
but avoid generation number wrap-around and trick handling of the number.
More details please see below.
>
>> trace_check_mmio_spte(spte, kvm_gen, spte_gen);
>> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> index 4b6c01b..1d4e78f 100644
>> --- a/virt/kvm/kvm_main.c
>> +++ b/virt/kvm/kvm_main.c
>> @@ -96,7 +96,7 @@ static void hardware_disable_all(void);
>>
>> static void kvm_io_bus_destroy(struct kvm_io_bus *bus);
>> static void update_memslots(struct kvm_memslots *slots,
>> - struct kvm_memory_slot *new, u64 last_generation);
>> + struct kvm_memory_slot *new);
>>
>> static void kvm_release_pfn_dirty(pfn_t pfn);
>> static void mark_page_dirty_in_slot(struct kvm *kvm,
>> @@ -685,8 +685,7 @@ static void sort_memslots(struct kvm_memslots *slots)
>> }
>>
>> static void update_memslots(struct kvm_memslots *slots,
>> - struct kvm_memory_slot *new,
>> - u64 last_generation)
>> + struct kvm_memory_slot *new)
>> {
>> if (new) {
>> int id = new->id;
>> @@ -697,8 +696,6 @@ static void update_memslots(struct kvm_memslots *slots,
>> if (new->npages != npages)
>> sort_memslots(slots);
>> }
>> -
>> - slots->generation = last_generation + 1;
>> }
>>
>> static int check_memory_region_flags(struct kvm_userspace_memory_region *mem)
>> @@ -720,10 +717,17 @@ static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
>> {
>> struct kvm_memslots *old_memslots = kvm->memslots;
>>
>> - update_memslots(slots, new, kvm->memslots->generation);
>> + /* ensure generation number is always increased. */
>> + slots->updated = true;
>> + slots->generation = old_memslots->generation;
>> + update_memslots(slots, new);
>> rcu_assign_pointer(kvm->memslots, slots);
>> synchronize_srcu_expedited(&kvm->srcu);
>>
>> + slots->generation++;
>> + smp_wmb();
>> + slots->updated = false;
>> +
>> kvm_arch_memslots_updated(kvm);
>>
>> return old_memslots;
>>
>
> This is effectively the same as the first approach.
>
> I just realized how simple Paolo's idea is. I think it can be a one line
> patch (without comments):
>
> [...]
> update_memslots(slots, new, kvm->memslots->generation);
> rcu_assign_pointer(kvm->memslots, slots);
> synchronize_srcu_expedited(&kvm->srcu);
> + slots->generation++;
>
> kvm_arch_memslots_updated(kvm);
> [...]
Really? Unfortunately no. :)
See this scenario:
CPU 0 CPU 1
ioctl registering a new memslot which
contains GPA:
page-fault handler:
see it'a mmio access on GPA;
assign the new memslots with generation number increased
cache the generation-number into spte;
fix the access and comeback to guest;
SRCU-sync
page-fault again and check the spte is a valid mmio-spte(*)
generation-number++;
return to userspace;
do mmio-emulation and inject mmio-exit;
!!! userspace receives a unexpected mmio-exit, that is case 2 i exactly
said in the last mail.
Note in the step *, my approach detects the invalid generation-number which
will invalidate the mmio spte properly .
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists