Message-ID: <7c13be04-1d18-45bd-8cfc-f5d37bd39a8e@redhat.com>
Date: Mon, 7 Oct 2024 18:27:39 +0200
From: David Hildenbrand <david@...hat.com>
To: Dave Hansen <dave.hansen@...el.com>,
Anthony Yznaga <anthony.yznaga@...cle.com>, akpm@...ux-foundation.org,
willy@...radead.org, markhemm@...glemail.com, viro@...iv.linux.org.uk,
khalid@...nel.org
Cc: andreyknvl@...il.com, luto@...nel.org, brauner@...nel.org, arnd@...db.de,
ebiederm@...ssion.com, catalin.marinas@....com, linux-arch@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, mhiramat@...nel.org,
rostedt@...dmis.org, vasily.averin@...ux.dev, xhao@...ux.alibaba.com,
pcc@...gle.com, neilb@...e.de, maz@...nel.org,
David Rientjes <rientjes@...gle.com>
Subject: Re: [RFC PATCH v3 00/10] Add support for shared PTEs across processes

On 07.10.24 17:58, Dave Hansen wrote:
> On 10/7/24 01:44, David Hildenbrand wrote:
>> On 02.10.24 19:35, Dave Hansen wrote:
>>> We were just chatting about this on David Rientjes's MM alignment call.
>>
>> Unfortunately I was not able to attend this time, my body decided it's a
>> good idea to stay in bed for a couple of days.
>>
>>> I thought I'd try to give a little brain
>>>
>>> Let's start by thinking about KVM and secondary MMUs. KVM has a primary
>>> mm: the QEMU (or whatever) process mm. The virtualization (EPT/NPT)
>>> tables get entries that effectively mirror the primary mm page tables
>>> and constitute a secondary MMU. If the primary page tables change,
>>> mmu_notifiers ensure that the changes get reflected into the
>>> virtualization tables and also that the virtualization paging structure
>>> caches are flushed.
>>>
>>> msharefs is doing something very similar. But, in the msharefs case,
>>> the secondary MMUs are actually normal CPU MMUs. The page tables are
>>> normal old page tables and the caches are the normal old TLB. That's
>>> what makes it so confusing: we have lots of infrastructure for dealing
>>> with that "stuff" (CPU page tables and TLB), but msharefs has
>>> short-circuited the infrastructure and it doesn't work any more.
>>
>> It's quite different IMHO, to a degree that I believe they are different
>> beasts:
>>
>> Secondary MMUs:
>> * "Belongs" to the same MM context as the primary MMU (process page tables)
>
> I think you're speaking to the ratio here. For each secondary MMU, I
> think you're saying that there's one and only one mm_struct. Is that right?

Yes, that is my understanding (at least with KVM). It's a secondary MMU
derived from exactly one primary MMU (MM context -> page table hierarchy).

The sophisticated ( :) ) notifier mechanism keeps the secondary MMU in
sync whenever the primary MMU is updated (of course, what to sync, and
how, depends in KVM on the memory slots that define how the guest
physical memory layout is derived from the process virtual address
space).
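
To make the "one primary MMU, separate secondary table, notifier keeps
them in sync" relationship concrete, here is a toy userspace sketch. It
is not kernel code: every toy_* name is made up for illustration, the
single invalidate callback is a simplification (the real mmu_notifier
interface has, among others, separate invalidate_range_start/end
callbacks), and the "memory slot" is reduced to a base offset plus a
length.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NPAGES 8
#define TOY_NONE   (-1)            /* "not present" marker */

/* Hypothetical notifier: invoked after primary entries [start, end) changed. */
struct toy_notifier {
    void (*invalidate)(struct toy_notifier *n, size_t start, size_t end);
};

/* Primary MMU: one table per mm, virtual page -> frame. */
struct toy_mm {
    int pte[TOY_NPAGES];
    struct toy_notifier *notifier; /* the one registered secondary MMU, or NULL */
};

/* Secondary MMU: separate table, derived from exactly one toy_mm. */
struct toy_secondary {
    struct toy_notifier notifier;  /* first member, so the cast below is valid */
    struct toy_mm *mm;
    size_t slot_base;              /* slot: guest page g <-> virtual page slot_base + g */
    size_t slot_npages;
    int spte[TOY_NPAGES];          /* guest page -> frame */
};

/* Zap every secondary entry whose backing virtual page lies in [start, end). */
static void toy_secondary_invalidate(struct toy_notifier *n, size_t start, size_t end)
{
    struct toy_secondary *s = (struct toy_secondary *)n;

    for (size_t g = 0; g < s->slot_npages; g++) {
        size_t vpn = s->slot_base + g;
        if (vpn >= start && vpn < end)
            s->spte[g] = TOY_NONE;
    }
}

static void toy_secondary_init(struct toy_secondary *s, struct toy_mm *mm,
                               size_t base, size_t npages)
{
    assert(base + npages <= TOY_NPAGES);
    s->notifier.invalidate = toy_secondary_invalidate;
    s->mm = mm;
    s->slot_base = base;
    s->slot_npages = npages;
    for (size_t g = 0; g < TOY_NPAGES; g++)
        s->spte[g] = TOY_NONE;
    mm->notifier = &s->notifier;   /* register with the primary MMU */
}

/* "Fault" path: a missing secondary entry is refilled from the primary table. */
static int toy_secondary_access(struct toy_secondary *s, size_t g)
{
    if (g >= s->slot_npages)
        return TOY_NONE;
    if (s->spte[g] == TOY_NONE)
        s->spte[g] = s->mm->pte[s->slot_base + g];
    return s->spte[g];
}

/* Primary update: change the entry, then notify the secondary MMU. */
static void toy_mm_set_pte(struct toy_mm *mm, size_t vpn, int frame)
{
    mm->pte[vpn] = frame;
    if (mm->notifier)
        mm->notifier->invalidate(mm->notifier, vpn, vpn + 1);
}

/* Returns 1 if the secondary table followed the primary update. */
int toy_demo(void)
{
    struct toy_mm mm = { .notifier = NULL };
    struct toy_secondary s;

    for (size_t vpn = 0; vpn < TOY_NPAGES; vpn++)
        mm.pte[vpn] = 100 + (int)vpn;

    toy_secondary_init(&s, &mm, 2, 4);         /* guest pages 0..3 <-> vpn 2..5 */
    int before = toy_secondary_access(&s, 1);  /* filled from vpn 3 */
    toy_mm_set_pte(&mm, 3, 777);               /* primary change zaps the copy */
    int after = toy_secondary_access(&s, 1);   /* refilled from the new entry */

    return before == 103 && after == 777;
}
```

Calling toy_demo() exercises exactly the path described above: the
secondary entry is a copy living in its own table, and it only stays
correct because the primary update path invokes the notifier.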
>
>> * Maintains separate tables/PTEs, in completely separate page table
>> hierarchy
>
> This is the case for KVM and the VMX/SVM MMUs, but it's not generally
> true about hardware. IOMMUs can walk x86 page tables and populate the
> IOTLB from the _same_ page table hierarchy as the CPU.

Yes, of course.

--
Cheers,
David / dhildenb