Message-ID: <e5f35fa6-ab8d-695a-6c66-bfb2b2465b2a@huawei.com>
Date: Fri, 12 Apr 2019 17:34:53 +0800
From: Zenghui Yu <yuzenghui@...wei.com>
To: Suzuki K Poulose <suzuki.poulose@....com>,
<linux-arm-kernel@...ts.infradead.org>
CC: <linux-kernel@...r.kernel.org>, <kvm@...r.kernel.org>,
<kvmarm@...ts.cs.columbia.edu>, <julien.thierry@....com>,
<christoffer.dall@....com>, <marc.zyngier@....com>,
<andrew.murray@....com>, <eric.auger@...hat.com>,
<zhengxiang9@...wei.com>, <wanghaibin.wang@...wei.com>
Subject: Re: [PATCH 2/2] kvm: arm: Unify handling THP backed host memory
On 2019/4/11 23:16, Suzuki K Poulose wrote:
> Hi Zenghui,
>
> On 11/04/2019 02:59, Zenghui Yu wrote:
>> Hi Suzuki,
>>
>> On 2019/4/10 23:23, Suzuki K Poulose wrote:
>>> We support mapping host memory backed by PMD transparent hugepages
>>> at stage2 as huge pages. However the checks are now spread across
>>> two different places. Let us unify the handling of the THPs to
>>> keep the code cleaner (and future proof for PUD THP support).
>>> This patch moves transparent_hugepage_adjust() closer to the caller
>>> to avoid a forward declaration for
>>> fault_supports_stage2_huge_mapping().
>>>
>>> Also, since we already handle the case where the host VA and the guest
>>> PA may not be aligned, the explicit VM_BUG_ON() is not required.
>>>
>>> Cc: Marc Zyngier <marc.zyngier@....com>
>>> Cc: Christoffer Dall <christoffer.dall@....com>
>>> Cc: Zenghui Yu <yuzenghui@...wei.com>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@....com>
>>> ---
>>> virt/kvm/arm/mmu.c | 123
>>> +++++++++++++++++++++++++++--------------------------
>>> 1 file changed, 62 insertions(+), 61 deletions(-)
>>>
>>> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
>>> index 6d73322..714eec2 100644
>>> --- a/virt/kvm/arm/mmu.c
>>> +++ b/virt/kvm/arm/mmu.c
>>> @@ -1380,53 +1380,6 @@ int kvm_phys_addr_ioremap(struct kvm *kvm,
>>> phys_addr_t guest_ipa,
>>> return ret;
>>> }
>>> -static bool transparent_hugepage_adjust(kvm_pfn_t *pfnp, phys_addr_t
>>> *ipap)
>>> -{
>>> - kvm_pfn_t pfn = *pfnp;
>>> - gfn_t gfn = *ipap >> PAGE_SHIFT;
>>> - struct page *page = pfn_to_page(pfn);
>>> -
>>> - /*
>>> - * PageTransCompoundMap() returns true for THP and
>>> - * hugetlbfs. Make sure the adjustment is done only for THP
>>> - * pages.
>>> - */
>>> - if (!PageHuge(page) && PageTransCompoundMap(page)) {
>>> - unsigned long mask;
>>> - /*
>>> - * The address we faulted on is backed by a transparent huge
>>> - * page. However, because we map the compound huge page and
>>> - * not the individual tail page, we need to transfer the
>>> - * refcount to the head page. We have to be careful that the
>>> - * THP doesn't start to split while we are adjusting the
>>> - * refcounts.
>>> - *
>>> - * We are sure this doesn't happen, because mmu_notifier_retry
>>> - * was successful and we are holding the mmu_lock, so if this
>>> - * THP is trying to split, it will be blocked in the mmu
>>> - * notifier before touching any of the pages, specifically
>>> - * before being able to call __split_huge_page_refcount().
>>> - *
>>> - * We can therefore safely transfer the refcount from PG_tail
>>> - * to PG_head and switch the pfn from a tail page to the head
>>> - * page accordingly.
>>> - */
>>> - mask = PTRS_PER_PMD - 1;
>>> - VM_BUG_ON((gfn & mask) != (pfn & mask));
>>> - if (pfn & mask) {
>>> - *ipap &= PMD_MASK;
>>> - kvm_release_pfn_clean(pfn);
>>> - pfn &= ~mask;
>>> - kvm_get_pfn(pfn);
>>> - *pfnp = pfn;
>>> - }
>>> -
>>> - return true;
>>> - }
>>> -
>>> - return false;
>>> -}
>>> -
>>> /**
>>> * stage2_wp_ptes - write protect PMD range
>>> * @pmd: pointer to pmd entry
>>> @@ -1677,6 +1630,61 @@ static bool
>>> fault_supports_stage2_huge_mapping(struct kvm_memory_slot *memslot,
>>> (hva & ~(map_size - 1)) + map_size <= uaddr_end;
>>> }
>>> +/*
>>> + * Check if the given hva is backed by a transparent huge page (THP)
>>> + * and whether it can be mapped using block mapping in stage2. If
>>> so, adjust
>>> + * the stage2 PFN and IPA accordingly. Only PMD_SIZE THPs are currently
>>> + * supported. This will need to be updated to support other THP sizes.
>>> + *
>>> + * Returns the size of the mapping.
>>> + */
>>> +static unsigned long
>>> +transparent_hugepage_adjust(struct kvm_memory_slot *memslot,
>>> + unsigned long hva, kvm_pfn_t *pfnp,
>>> + phys_addr_t *ipap)
>>> +{
>>> + kvm_pfn_t pfn = *pfnp;
>>> + struct page *page = pfn_to_page(pfn);
>>> +
>>> + /*
>>> + * PageTransCompoundMap() returns true for THP and
>>> + * hugetlbfs. Make sure the adjustment is done only for THP
>>> + * pages. Also make sure that the HVA and IPA are sufficiently
>>> + * aligned and that the block map is contained within the memslot.
>>> + */
>>> + if (!PageHuge(page) && PageTransCompoundMap(page) &&
>>
>> By the time we get here, we know that we are only dealing with normal
>> size pages and that no hugetlbfs pages are involved. "!PageHuge(page)"
>> will always be true, so the check can be dropped.
>
> I think that is a bit tricky. If someone ever modifies user_mem_abort()
> and we end up getting called with a HugeTLB-backed page, things could go
> wrong.
That would be bad. I'm not sure whether it could happen in the future.
> I could remove the check, but would like to add a WARN_ON_ONCE() to make
> sure our assumption holds.
>
> i.e.,
> WARN_ON_ONCE(PageHuge(page));
Still, this is a careful approach. I think it will be valuable both for
developers and for the code itself. Thanks!
zenghui
>
> if (PageTransCompoundMap(page) &&
> fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) {
>
> ...
>
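For anyone following along, here is a rough user-space sketch of the arithmetic being discussed. It is not the kernel code: it assumes 4K pages and 2M PMD blocks, and every name prefixed with model_ or MODEL_ is made up for illustration. The block_ok argument merely stands in for the result of fault_supports_stage2_huge_mapping(), and the refcount transfer (kvm_release_pfn_clean()/kvm_get_pfn()) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed geometry, not taken from the thread: 4K pages, 2M PMD blocks. */
#define MODEL_PAGE_SHIFT   12
#define MODEL_PMD_SHIFT    21
#define MODEL_PAGE_SIZE    (UINT64_C(1) << MODEL_PAGE_SHIFT)
#define MODEL_PMD_SIZE     (UINT64_C(1) << MODEL_PMD_SHIFT)
#define MODEL_PMD_MASK     (~(MODEL_PMD_SIZE - 1))
#define MODEL_PTRS_PER_PMD (UINT64_C(1) << (MODEL_PMD_SHIFT - MODEL_PAGE_SHIFT))

/* Hypothetical stand-in for the two struct page predicates in the patch. */
struct model_page {
	bool is_hugetlb;      /* models PageHuge() */
	bool trans_compound;  /* models PageTransCompoundMap() */
};

/* Set the first time the "no hugetlbfs page here" assumption is violated. */
static bool model_warned;

/*
 * Model of the proposed flow: warn once if a hugetlbfs page shows up,
 * then adjust pfn and IPA to the start of the block when the page is
 * compound and the caller says a block mapping fits (block_ok).
 * Returns the size of the mapping that would be installed.
 */
static uint64_t model_thp_adjust(const struct model_page *page, bool block_ok,
				 uint64_t *pfnp, uint64_t *ipap)
{
	uint64_t mask = MODEL_PTRS_PER_PMD - 1;

	if (page->is_hugetlb)
		model_warned = true;  /* stands in for WARN_ON_ONCE() */

	if (page->trans_compound && block_ok) {
		if (*pfnp & mask) {
			*ipap &= MODEL_PMD_MASK;  /* IPA of the block start */
			*pfnp &= ~mask;           /* pfn of the head page */
		}
		return MODEL_PMD_SIZE;
	}

	return MODEL_PAGE_SIZE;
}
```

With these assumed sizes, a fault on pfn 0x12345 at IPA 0x40345000 would be adjusted to pfn 0x12200 and IPA 0x40200000. The warning in the model does not change the result; it only records that the assumption was broken, which is the point of replacing the silent !PageHuge() test.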