Message-ID: <9927f9a3-efba-4053-8384-cc69c7949ea6@intel.com>
Date: Wed, 2 Oct 2024 10:35:24 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Anthony Yznaga <anthony.yznaga@...cle.com>, akpm@...ux-foundation.org,
 willy@...radead.org, markhemm@...glemail.com, viro@...iv.linux.org.uk,
 david@...hat.com, khalid@...nel.org
Cc: andreyknvl@...il.com, luto@...nel.org, brauner@...nel.org, arnd@...db.de,
 ebiederm@...ssion.com, catalin.marinas@....com, linux-arch@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org, mhiramat@...nel.org,
 rostedt@...dmis.org, vasily.averin@...ux.dev, xhao@...ux.alibaba.com,
 pcc@...gle.com, neilb@...e.de, maz@...nel.org,
 David Rientjes <rientjes@...gle.com>
Subject: Re: [RFC PATCH v3 00/10] Add support for shared PTEs across processes

We were just chatting about this on David Rientjes's MM alignment call.
I thought I'd try to give a little brain dump.

Let's start by thinking about KVM and secondary MMUs.  KVM has a primary
mm: the QEMU (or whatever) process mm.  The virtualization (EPT/NPT)
tables get entries that effectively mirror the primary mm page tables
and constitute a secondary MMU.  If the primary page tables change,
mmu_notifiers ensure that the changes get reflected into the
virtualization tables and also that the virtualization paging structure
caches are flushed.
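
To make the mirroring concrete, here is a toy userspace model of that relationship. It is only a sketch: every name in it (primary_mm, secondary_mmu, secondary_fault() and so on) is made up for illustration and is not the kernel's mmu_notifier API or anything KVM actually implements. It only shows the ordering: the primary notifies the secondary before tearing down its own entries, and the secondary drops its mirrored copies and "flushes".

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES  8
#define NO_PAGE (-1)

/* Hypothetical callback type, loosely inspired by an invalidate hook. */
typedef void (*invalidate_fn)(void *ctx, size_t start, size_t end);

/* Toy "primary mm": a flat table of virtual index -> frame number. */
struct primary_mm {
	int pte[NPAGES];
	invalidate_fn notify;	/* registered secondary MMU, if any */
	void *notify_ctx;
};

/* Toy "secondary MMU": entries mirrored lazily from the primary. */
struct secondary_mmu {
	int spte[NPAGES];
	int flushes;		/* counts simulated cache flushes */
};

static void primary_init(struct primary_mm *mm)
{
	for (size_t i = 0; i < NPAGES; i++)
		mm->pte[i] = NO_PAGE;
	mm->notify = NULL;
	mm->notify_ctx = NULL;
}

static void secondary_init(struct secondary_mmu *s)
{
	for (size_t i = 0; i < NPAGES; i++)
		s->spte[i] = NO_PAGE;
	s->flushes = 0;
}

static void primary_map(struct primary_mm *mm, size_t idx, int frame)
{
	assert(idx < NPAGES);
	mm->pte[idx] = frame;
}

/* Notifier callback: drop mirrored entries in [start, end) and "flush". */
static void secondary_invalidate(void *ctx, size_t start, size_t end)
{
	struct secondary_mmu *s = ctx;

	for (size_t i = start; i < end && i < NPAGES; i++)
		s->spte[i] = NO_PAGE;
	s->flushes++;
}

/* Unmap [start, end): tell the secondary first, then clear the primary. */
static void primary_unmap(struct primary_mm *mm, size_t start, size_t end)
{
	if (mm->notify)
		mm->notify(mm->notify_ctx, start, end);
	for (size_t i = start; i < end && i < NPAGES; i++)
		mm->pte[i] = NO_PAGE;
}

/* Secondary "fault": fill the mirrored entry from the primary on demand. */
static int secondary_fault(struct secondary_mmu *s,
			   const struct primary_mm *mm, size_t idx)
{
	assert(idx < NPAGES);
	if (s->spte[idx] == NO_PAGE)
		s->spte[idx] = mm->pte[idx];
	return s->spte[idx];
}
```

The model leaves out locking, races between the notification and a concurrent secondary fault, and the lifetime of the underlying page, which are exactly the hard parts in the real thing.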

msharefs is doing something very similar.  But, in the msharefs case,
the secondary MMUs are actually normal CPU MMUs.  The page tables are
normal old page tables and the caches are the normal old TLB.  That's
what makes it so confusing: we have lots of infrastructure for dealing
with that "stuff" (CPU page tables and TLB), but msharefs has
short-circuited the infrastructure and it doesn't work any more.

Basically, I think it makes a lot of sense to check what KVM (or another
mmu_notifier user) is doing and make sure that msharefs is following its
lead.  For instance, KVM _should_ have the exact same "page free"
flushing issue where it gets the MMU notifier call but the page may
still be in the secondary MMU.  I _think_ KVM fixes it with an extra
page refcount that it takes when it first walks the primary page tables.

But the short of it is that the msharefs host mm represents a "secondary
MMU".  I don't think it is really that special of an MMU other than the
fact that it has an mm_struct.
