linux-kernel - Re: async_pf.c && use_mm() (Was: mm,vmacache: also flush cache for VM_CLONE)


Message-ID: <20140314182331.GA11482@redhat.com>
Date:	Fri, 14 Mar 2014 19:23:31 +0100
From:	Oleg Nesterov <oleg@...hat.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
Cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Gleb Natapov <gleb@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Davidlohr Bueso <davi@...hat.com>,
	Davidlohr Bueso <davidlohr@...com>,
	KOSAKI Motohiro <kosaki.motohiro@...il.com>,
	Rik van Riel <riel@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Mel Gorman <mgorman@...e.de>,
	Michel Lespinasse <walken@...gle.com>,
	Ingo Molnar <mingo@...nel.org>
Subject: Re: async_pf.c && use_mm() (Was: mm,vmacache: also flush cache for
	VM_CLONE)

On 03/13, Linus Torvalds wrote:
>
> Ok, no longer on my phone, and no, it clearly does the reference count with a
>
>     atomic_inc(&work->mm->mm_count);
>
> separately. The use_mm/unuse_mm seems entirely specious.

Yes, it really looks as if we can simply remove it.

But once again, with or without use_mm(), the refcounting looks buggy:
->mm_count only pins the mm_struct itself, not the address space.
get_user_pages() is simply wrong if ->mm_users == 0 and exit_mmap() etc.
has already been called (or is in progress).
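
To make the distinction concrete, here is a toy user-space model in C11.
It is not kernel code: struct toy_mm and the toy_* helpers are hypothetical
stand-ins for mm_struct, mmput() and mmdrop(), and the single-threaded
sequence simply assumes that the owner exits before the queued work runs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct mm_struct; not the kernel definition. */
struct toy_mm {
	atomic_int users;	/* models ->mm_users: users of the address space */
	atomic_int count;	/* models ->mm_count: references to the struct */
	bool mappings_alive;	/* false once the toy "exit_mmap" has run */
	bool freed;		/* set instead of freeing, so it can be inspected */
};

static void toy_mm_init(struct toy_mm *mm)
{
	atomic_init(&mm->users, 1);	/* the owning task */
	atomic_init(&mm->count, 1);	/* all users together hold one struct ref */
	mm->mappings_alive = true;
	mm->freed = false;
}

/* Models mmdrop(): drop a reference to the struct itself. */
static void toy_mmdrop(struct toy_mm *mm)
{
	assert(!mm->freed);		/* catch an unbalanced drop */
	if (atomic_fetch_sub(&mm->count, 1) == 1)
		mm->freed = true;
}

/* Models mmput(): the last user tears down the mappings. */
static void toy_mmput(struct toy_mm *mm)
{
	if (atomic_fetch_sub(&mm->users, 1) == 1) {
		mm->mappings_alive = false;	/* the toy "exit_mmap" */
		toy_mmdrop(mm);
	}
}

/*
 * Returns true if a deferred worker could still safely fault pages in
 * after the owner has exited. pin_users selects which counter the
 * worker pinned when the work was queued.
 */
static bool toy_worker_can_fault_in(bool pin_users)
{
	struct toy_mm mm;
	bool ok;

	toy_mm_init(&mm);

	/* Setup step: take a reference before queueing the work. */
	if (pin_users)
		atomic_fetch_add(&mm.users, 1);
	else
		atomic_fetch_add(&mm.count, 1);

	/* Assumed ordering: the owner exits before the work runs. */
	toy_mmput(&mm);

	/* The worker runs now; faulting in needs live mappings. */
	ok = mm.mappings_alive && !mm.freed;

	/* The worker drops its reference the matching way. */
	if (pin_users)
		toy_mmput(&mm);
	else
		toy_mmdrop(&mm);

	return ok;
}
```

Under these assumptions toy_worker_can_fault_in(false) returns false: the
struct is still allocated, but its mappings are already gone. With
toy_worker_can_fault_in(true) the mappings survive until the worker is done.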

So I think we need something like the patch below, but I can't test this
change or audit the other (potential) users of kvm_async_pf->mm.

Perhaps this is not a bug and it is somehow guaranteed that, say,
kvm_clear_async_pf_completion_queue() must always be called before the
caller of kvm_setup_async_pf() can exit? I don't know, but in that case
we do not need any accounting at all, and this should be documented.

Gleb, what do you think?

Oleg.

--- x/virt/kvm/async_pf.c
+++ x/virt/kvm/async_pf.c
@@ -65,11 +65,9 @@ static void async_pf_execute(struct work_struct *work)
 
 	might_sleep();
 
-	use_mm(mm);
 	down_read(&mm->mmap_sem);
 	get_user_pages(current, mm, addr, 1, 1, 0, NULL, NULL);
 	up_read(&mm->mmap_sem);
-	unuse_mm(mm);
 
 	spin_lock(&vcpu->async_pf.lock);
 	list_add_tail(&apf->link, &vcpu->async_pf.done);
@@ -85,7 +83,7 @@ static void async_pf_execute(struct work_struct *work)
 	if (waitqueue_active(&vcpu->wq))
 		wake_up_interruptible(&vcpu->wq);
 
-	mmdrop(mm);
+	mmput(mm);
 	kvm_put_kvm(vcpu->kvm);
 }
 
@@ -98,7 +96,7 @@ void kvm_clear_async_pf_completion_queue(struct kvm_vcpu *vcpu)
 				   typeof(*work), queue);
 		list_del(&work->queue);
 		if (cancel_work_sync(&work->work)) {
-			mmdrop(work->mm);
+			mmput(work->mm);
 			kvm_put_kvm(vcpu->kvm); /* == work->vcpu->kvm */
 			kmem_cache_free(async_pf_cache, work);
 		}
@@ -162,7 +160,7 @@ int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
 	work->addr = gfn_to_hva(vcpu->kvm, gfn);
 	work->arch = *arch;
 	work->mm = current->mm;
-	atomic_inc(&work->mm->mm_count);
+	atomic_inc(&work->mm->mm_users);
 	kvm_get_kvm(work->vcpu->kvm);
 
 	/* this can't really happen otherwise gfn_to_pfn_async
@@ -180,7 +178,7 @@ int kvm_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn,
 	return 1;
 retry_sync:
 	kvm_put_kvm(work->vcpu->kvm);
-	mmdrop(work->mm);
+	mmput(work->mm);
 	kmem_cache_free(async_pf_cache, work);
 	return 0;
 }
