Message-ID: <20140314182331.GA11482@redhat.com>
Date: Fri, 14 Mar 2014 19:23:31 +0100
From: Oleg Nesterov <oleg@...hat.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Gleb Natapov <gleb@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Davidlohr Bueso <davi@...hat.com>,
Davidlohr Bueso <davidlohr@...com>,
KOSAKI Motohiro <kosaki.motohiro@...il.com>,
Rik van Riel <riel@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mgorman@...e.de>,
Michel Lespinasse <walken@...gle.com>,
Ingo Molnar <mingo@...nel.org>
Subject: Re: async_pf.c && use_mm() (Was: mm,vmacache: also flush cache for
VM_CLONE)
On 03/13, Linus Torvalds wrote:
>
> Ok, no longer on my phone, and no, it clearly does the reference count with a
>
> atomic_inc(&work->mm->mm_count);
>
> separately. The use_mm/unuse_mm seems entirely specious.
Yes, it really looks as if we can simply remove it.
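
My understanding of why: use_mm() only installs "mm" as the kthread's
active address space, which is what copy_to_user() and friends need;
get_user_pages() is told which mm to operate on via its arguments. As
an illustration only (not part of the patch; uaddr/val/ret are made up
for the example), the two patterns are:

	/* a kthread that wants copy_to_user() & co to hit "mm" */
	use_mm(mm);
	if (copy_to_user(uaddr, &val, sizeof(val)))
		ret = -EFAULT;
	unuse_mm(mm);

	/* gup doesn't care about current->mm, it takes "mm" directly */
	down_read(&mm->mmap_sem);
	get_user_pages(current, mm, addr, 1, 1, 0, NULL, NULL);
	up_read(&mm->mmap_sem);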

But once again, with or without use_mm() the refcounting looks buggy.
get_user_pages() is simply wrong if ->mm_users == 0 and exit_mmap()
etc. has already been called (or is in progress).
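
To recall the rules (a rough sketch, not from the patch; the comments
are mine):

	atomic_inc(&mm->mm_count);	/* pins struct mm_struct only */
	mmdrop(mm);			/* frees the struct at zero   */

	atomic_inc(&mm->mm_users);	/* pins the address space too */
	mmput(mm);			/* runs exit_mmap() and then  */
					/* mmdrop() at zero           */

So a worker that holds only ->mm_count can see the vmas and page
tables destroyed under it while it runs.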

So I think we need something like the change below, but I can't test
it or audit the other (potential) users of kvm_async_pf->mm.

Perhaps this is not a bug, and somehow it is guaranteed that, say,
kvm_clear_async_pf_completion_queue() must always be called before the
caller of kvm_setup_async_pf() can exit? I don't know, but in that case
we would not need any refcounting at all, and this should be documented.
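
If that guarantee does exist, something like this comment in
kvm_setup_async_pf() would be enough (illustration only, of course):

	/*
	 * No mm refcounting needed: kvm_clear_async_pf_completion_queue()
	 * is always called before the owner of this mm can exit, so
	 * apf->mm can't be destroyed under async_pf_execute().
	 */
	work->mm = current->mm;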
Gleb, what do you think?
Oleg.

--- x/virt/kvm/async_pf.c
+++ x/virt/kvm/async_pf.c
@@ -65,11 +65,9 @@ static void async_pf_execute(struct work_struct *work)
 
 	might_sleep();
 
-	use_mm(mm);
 	down_read(&mm->mmap_sem);
 	get_user_pages(current, mm, addr, 1, 1, 0, NULL, NULL);
 	up_read(&mm->mmap_sem);
-	unuse_mm(mm);
 
 	spin_lock(&vcpu->async_pf.lock);
 	list_add_tail(&apf->link, &vcpu->async_pf.done);
@@ -85,7 +83,7 @@ static void async_pf_execute(struct work_struct *work)
 	if (waitqueue_active(&vcpu->wq))
 		wake_up_interruptible(&vcpu->wq);
 
-	mmdrop(mm);
+	mmput(mm);
 	kvm_put_kvm(vcpu->kvm);
 }
 
@@ -98,7 +96,7 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
 					   typeof(*work), queue);
 		list_del(&work->queue);
 		if (cancel_work_sync(&work->work)) {
-			mmdrop(work->mm);
+			mmput(work->mm);
 			kvm_put_kvm(vcpu->kvm); /* == work->vcpu->kvm */
 			kmem_cache_free(async_pf_cache, work);
 		}
@@ -162,7 +160,7 @@ int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
 	work->addr = gfn_to_hva(vcpu->kvm, gfn);
 	work->arch = *arch;
 	work->mm = current->mm;
-	atomic_inc(&work->mm->mm_count);
+	atomic_inc(&work->mm->mm_users);
 	kvm_get_kvm(work->vcpu->kvm);
 
 	/* this can't really happen otherwise gfn_to_pfn_async
@@ -180,7 +178,7 @@ int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
 	return 1;
 retry_sync:
 	kvm_put_kvm(work->vcpu->kvm);
-	mmdrop(work->mm);
+	mmput(work->mm);
 	kmem_cache_free(async_pf_cache, work);
 	return 0;
 }