lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAG48ez0=9O-V0V6v_LUgRcF46BooJdk3eqb6xgDpKpNZuW1L2A@mail.gmail.com>
Date: Mon, 14 Oct 2024 22:07:52 +0200
From: Jann Horn <jannh@...gle.com>
To: Anthony Yznaga <anthony.yznaga@...cle.com>
Cc: akpm@...ux-foundation.org, willy@...radead.org, markhemm@...glemail.com, 
	viro@...iv.linux.org.uk, david@...hat.com, khalid@...nel.org, 
	andreyknvl@...il.com, dave.hansen@...el.com, luto@...nel.org, 
	brauner@...nel.org, arnd@...db.de, ebiederm@...ssion.com, 
	catalin.marinas@....com, linux-arch@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-mm@...ck.org, mhiramat@...nel.org, 
	rostedt@...dmis.org, vasily.averin@...ux.dev, xhao@...ux.alibaba.com, 
	pcc@...gle.com, neilb@...e.de, maz@...nel.org
Subject: Re: [RFC PATCH v3 00/10] Add support for shared PTEs across processes

On Wed, Sep 4, 2024 at 1:22 AM Anthony Yznaga <anthony.yznaga@...cle.com> wrote:
> One major issue to address for this series to function correctly
> is how to ensure proper TLB flushing when a page in a shared
> region is unmapped. For example, since the rmaps for pages in a
> shared region map back to host vmas which point to a host mm, TLB
> flushes won't be directed to the CPUs the sharing processes have
> run on. I am by no means an expert in this area. One idea is to
> install a mmu_notifier on the host mm that can gather the necessary
> data and do flushes similar to the batch flushing.

The mmu_notifier API has two ways you can use it:

First, there is the classic mode, where before you start modifying
PTEs in some range, you remove mirrored PTEs from some other context,
and until you're done with your PTE modification, you don't allow
creation of new mirrored PTEs. This is intended for cases where
individual PTE entries are copied over to some other context (such as
EPT tables for virtualization). When I last looked at that code, it
looked fine, and this is what KVM uses. But it probably doesn't match
your usecase, since you wouldn't want removal of a single page to
cause the entire page table containing it to be temporarily unmapped
from the processes that use it?

Second, there is a newer mode for IOMMUv2 stuff (using the
mmu_notifier_ops::invalidate_range callback), where the idea is that
you have secondary MMUs that share the normal page tables, and so you
basically send them invalidations at the same time you invalidate the
primary MMU for the process. I think that's the right fit for this
usecase; however, last I looked, this code was extremely broken (see
https://lore.kernel.org/lkml/CAG48ez2NQKVbv=yG_fq_jtZjf8Q=+Wy54FxcFrK_OujFg5BwSQ@mail.gmail.com/
for context). Unless that's changed in the meantime, I think someone
would have to fix that code before it can be relied on for new
usecases.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ