Message-ID: <e2a94937-c324-e2d6-7e61-3f998e6e6e22@arm.com>
Date: Tue, 12 Mar 2019 11:32:53 +0000
From: Marc Zyngier <marc.zyngier@....com>
To: Zheng Xiang <zhengxiang9@...wei.com>, christoffer.dall@....com,
catalin.marinas@....com, will.deacon@....com,
suzuki.poulose@....com, james.morse@....com
Cc: linux-arm-kernel@...ts.infradead.org, kvmarm@...ts.cs.columbia.edu,
linux-kernel@...r.kernel.org,
Wang Haibin <wanghaibin.wang@...wei.com>,
"yuzenghui@...wei.com" <yuzenghui@...wei.com>,
lious.lilei@...ilicon.com, lishuo1@...ilicon.com
Subject: Re: [RFC] Question about TLB flush while set Stage-2 huge pages
Hi Zheng,
On 11/03/2019 16:31, Zheng Xiang wrote:
> Hi all,
>
> When a page is merged into a transparent huge page, KVM will invalidate Stage-2 for
> the base address of the huge page and the whole of Stage-1.
> However, this only invalidates the first page within the huge page; the other
> pages are not invalidated, see below:
>
> +---------------+--------------+
> |abcde 2MB-Page |
> +---------------+--------------+
>
> TLB before setting new pmd:
> +---------------+--------------+
> | VA | PAGESIZE |
> +---------------+--------------+
> | a | 4KB |
> +---------------+--------------+
> | b | 4KB |
> +---------------+--------------+
> | c | 4KB |
> +---------------+--------------+
> | d | 4KB |
> +---------------+--------------+
>
> TLB after setting new pmd:
> +---------------+--------------+
> | VA | PAGESIZE |
> +---------------+--------------+
> | a | 2MB |
> +---------------+--------------+
> | b | 4KB |
> +---------------+--------------+
> | c | 4KB |
> +---------------+--------------+
> | d | 4KB |
> +---------------+--------------+
>
> When the VM accesses address *b*, it will hit the TLB and result in a TLB conflict abort or other potential exceptions.
That's really bad. I can only imagine two scenarios:
1) We fail to unmap a,b,c,d (and potentially another 508 PTEs), losing
the PTE table in the process, and place the PMD instead. I can't see
this happening.
2) We fail to invalidate on unmap, and that's slightly less bad (but still
quite bad).
Which of the two cases are you seeing?
> For example, we need to keep track of the VM's dirty memory pages while the VM is in live migration.
> KVM will set the memslot READONLY and split the huge pages.
> After live migration is canceled and aborted, the pages will be merged into a THP.
> Later accesses to these pages, which are READONLY, will cause level-3 Permission Faults until they are invalidated.
>
> So should we invalidate the TLB entries for all related pages (e.g. a, b, c, d), like __flush_tlb_range()?
> Or we can call __kvm_tlb_flush_vmid() to invalidate all TLB entries.
We should perform an invalidate on each unmap. unmap_stage2_range seems
to do the right thing. __flush_tlb_range only caters for Stage1
mappings, and __kvm_tlb_flush_vmid() is too big a hammer, as it nukes
TLBs for the whole VM.
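
To make the difference concrete, here is a toy userspace model in C. It is
only a sketch and not kernel code: every name in it (toy_tlb, toy_tlb_fill,
toy_tlb_hits, toy_tlb_inv_ipa, toy_tlb_inv_range) is made up for
illustration, the TLB is assumed to be a small array, and a by-address
invalidate is assumed to drop every cached entry covering that address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
#define TOY_TLB_SIZE 16
#define SZ_4K 0x1000ULL
#define SZ_2M 0x200000ULL

struct toy_tlb_entry {
	uint64_t ipa;	/* base IPA of the cached translation */
	uint64_t size;	/* SZ_4K or SZ_2M */
	bool valid;
};

struct toy_tlb {
	struct toy_tlb_entry e[TOY_TLB_SIZE];
};

/* Cache a translation in the first free slot; returns false if full. */
static bool toy_tlb_fill(struct toy_tlb *t, uint64_t ipa, uint64_t size)
{
	assert(size == SZ_4K || size == SZ_2M);
	for (size_t i = 0; i < TOY_TLB_SIZE; i++) {
		if (!t->e[i].valid) {
			t->e[i].ipa = ipa & ~(size - 1);
			t->e[i].size = size;
			t->e[i].valid = true;
			return true;
		}
	}
	return false;
}

/* Number of valid entries covering ipa; more than one models a conflict. */
static int toy_tlb_hits(const struct toy_tlb *t, uint64_t ipa)
{
	int n = 0;

	for (size_t i = 0; i < TOY_TLB_SIZE; i++) {
		const struct toy_tlb_entry *e = &t->e[i];

		if (e->valid && ipa >= e->ipa && ipa - e->ipa < e->size)
			n++;
	}
	return n;
}

/* Drop every entry that covers ipa (a single by-address invalidate). */
static void toy_tlb_inv_ipa(struct toy_tlb *t, uint64_t ipa)
{
	for (size_t i = 0; i < TOY_TLB_SIZE; i++) {
		struct toy_tlb_entry *e = &t->e[i];

		if (e->valid && ipa >= e->ipa && ipa - e->ipa < e->size)
			e->valid = false;
	}
}

/* Invalidate each 4KB page in [start, start + len), as an unmap would. */
static void toy_tlb_inv_range(struct toy_tlb *t, uint64_t start, uint64_t len)
{
	for (uint64_t a = start; a < start + len; a += SZ_4K)
		toy_tlb_inv_ipa(t, a);
}
```

With a, b, c, d cached as 4KB entries, calling toy_tlb_inv_ipa() on the base
address only and then filling a 2MB entry leaves two entries covering b (the
conflict described above). Calling toy_tlb_inv_range() over the whole 2MB
region before the fill leaves exactly one.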
I'd really like to understand what you're seeing, and how to reproduce
it. Do you have a minimal example I could run on my own HW?
Thanks,
M.
--
Jazz is not dead. It just smells funny...