Message-ID: <20120528143221.GF4016@redhat.com>
Date: Mon, 28 May 2012 16:32:21 +0200
From: Andrea Arcangeli <aarcange@...hat.com>
To: Avi Kivity <avi@...hat.com>
Cc: Xiao Guangrong <xiaoguangrong@...ux.vnet.ibm.com>,
Marcelo Tosatti <mtosatti@...hat.com>,
LKML <linux-kernel@...r.kernel.org>, KVM <kvm@...r.kernel.org>
Subject: Re: [PATCH] KVM: MMU: fix huge page adapted on non-PAE host
Hi,
On Mon, May 28, 2012 at 04:53:38PM +0300, Avi Kivity wrote:
> As far as I can tell __get_user_pages_fast() will take the reference
> count in the page head in the first place.
mask = KVM_PAGES_PER_HPAGE(level) - 1;
The BUG would trigger if the above KVM mask is 2M (that is the NPT/EPT
pmd size), but the hugepage size in the host is 4M (non-PAE 32-bit).
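To make the size mismatch concrete, here is a minimal userspace sketch of the
alignment arithmetic. It is not kernel code: the macro and helper names
(KVM_PAGES_PER_2M, HOST_PAGES_PER_4M, align_pfn_2m, is_host_head,
count_tail_targets) are hypothetical, and 4K base pages are assumed.

```c
#include <assert.h>

#define PAGE_SHIFT 12                                  /* assumed 4K base pages */
#define KVM_PAGES_PER_2M  (1UL << (21 - PAGE_SHIFT))   /* 512 pfns per NPT/EPT pmd */
#define HOST_PAGES_PER_4M (1UL << (22 - PAGE_SHIFT))   /* 1024 pfns per host hugepage */

/* Round a pfn down to the start of the 2M region that contains it. */
static unsigned long align_pfn_2m(unsigned long pfn)
{
    unsigned long mask = KVM_PAGES_PER_2M - 1;

    return pfn & ~mask;
}

/* True if this pfn is the first pfn (the head) of a 4M host hugepage. */
static int is_host_head(unsigned long pfn)
{
    return (pfn & (HOST_PAGES_PER_4M - 1)) == 0;
}

/*
 * Walk one 4M host hugepage in 2M steps and count how many of the
 * 2M-aligned pfns land on a tail page rather than on the head.
 */
static unsigned long count_tail_targets(unsigned long head_pfn)
{
    unsigned long pfn, tails = 0;

    assert(is_host_head(head_pfn));    /* caller must pass the head pfn */
    for (pfn = head_pfn; pfn < head_pfn + HOST_PAGES_PER_4M;
         pfn += KVM_PAGES_PER_2M)
        if (!is_host_head(align_pfn_2m(pfn)))
            tails++;
    return tails;
}
```

With these assumed sizes, count_tail_targets(0x400) returns 1: of the two
2M-aligned pfns inside the 4M hugepage, one is the head and the other is a
tail page.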
The refcount is taken only in the head page for heads, and in both the
head and the tail page for tails.
Because we have the mmu notifier, we never keep the pages mapped by
sptes refcounted; we drop them all. So all we need to do is move the
refcount to the exact same pfn that is then freed by mmu_set_spte
(kvm_release_pfn_clean at the end).
The adjustment is not done for the refcounting. The issue here is that
we want to adjust the "pfn" passed to mmu_set_spte, and in turn we have
to move the refcounting too, because kvm_release_pfn_clean will run
on that "pfn" (no longer on the pfn returned by gup-fast).
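The "move the refcount along with the pfn" idea can be sketched with a toy
userspace model. Everything here is hypothetical (the refcount array and the
toy_get_page, toy_put_page and toy_adjust_pfn helpers); it only shows that the
final release must hit the same pfn that holds the reference, and it does not
model the real head/tail refcounting rules.

```c
#include <assert.h>

#define NPAGES 2048UL              /* size of the toy "physical memory" */
#define KVM_PAGES_PER_2M 512UL     /* assumed 2M pmd with 4K base pages */

static int refcount[NPAGES];       /* toy stand-in for per-page refcounts */

static void toy_get_page(unsigned long pfn)
{
    assert(pfn < NPAGES);
    refcount[pfn]++;
}

static void toy_put_page(unsigned long pfn)
{
    assert(pfn < NPAGES && refcount[pfn] > 0);
    refcount[pfn]--;
}

/*
 * Align the pfn down to the 2M boundary and move the reference with it,
 * so that a later toy_put_page() on the returned pfn is balanced.
 */
static unsigned long toy_adjust_pfn(unsigned long pfn)
{
    unsigned long mask = KVM_PAGES_PER_2M - 1;
    unsigned long aligned = pfn & ~mask;

    if (aligned != pfn) {
        toy_get_page(aligned);     /* take the reference on the new pfn */
        toy_put_page(pfn);         /* drop the one held on the old pfn */
    }
    return aligned;
}
```

In this model, a caller that took a reference on pfn 1724 and then calls
toy_adjust_pfn(1724) gets back 1536 with the reference now held there, so the
final toy_put_page(1536) leaves every count at zero.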
So it looks fine to just do get_page and the patch looks correct (not
sure if the mmio check is needed or if we can just do get_page) as
long as the "pfn" that is returned through the &pfn parameter and then
passed to mmu_set_spte is the same one where we do get_page.
The reason it was a get_page_unless_zero() is that it wanted to check
that there was no THP split and the head page was still there. The
problem is that with a 4M host page size and a 2M NPT/EPT pmd size, we
need to get_page a tail page half of the time, and
get_page_unless_zero() won't take a correct refcount on tail pages; it
is not equivalent to a full get_page.
Overall the most important thing is that the pfn returned is the
correct one that matches the alignment of the NPT/EPT hugepmd size;
the refcounting just closely follows that aligned "pfn".