Message-ID: <CAG48ez3TTicKSxXyScmqq5Gg91+-KCSk80EccwkbvsQjLzjCFA@mail.gmail.com>
Date: Fri, 30 May 2025 19:46:02 +0200
From: Jann Horn <jannh@...gle.com>
To: Anthony Yznaga <anthony.yznaga@...cle.com>
Cc: akpm@...ux-foundation.org, willy@...radead.org, markhemm@...glemail.com, 
	viro@...iv.linux.org.uk, david@...hat.com, khalid@...nel.org, 
	andreyknvl@...il.com, dave.hansen@...el.com, luto@...nel.org, 
	brauner@...nel.org, arnd@...db.de, ebiederm@...ssion.com, 
	catalin.marinas@....com, linux-arch@...r.kernel.org, 
	linux-kernel@...r.kernel.org, linux-mm@...ck.org, mhiramat@...nel.org, 
	rostedt@...dmis.org, vasily.averin@...ux.dev, xhao@...ux.alibaba.com, 
	pcc@...gle.com, neilb@...e.de, maz@...nel.org
Subject: Re: [PATCH v2 08/20] mm/mshare: flush all TLBs when updating PTEs in
 an mshare range

On Fri, May 30, 2025 at 6:30 PM Anthony Yznaga
<anthony.yznaga@...cle.com> wrote:
> On 5/30/25 7:41 AM, Jann Horn wrote:
> > On Fri, Apr 4, 2025 at 4:18 AM Anthony Yznaga <anthony.yznaga@...cle.com> wrote:
> >> Unlike the mm of a task, an mshare host mm is not updated on context
> >> switch. In particular this means that mm_cpumask is never updated
> >> which results in TLB flushes for updates to mshare PTEs only being
> >> done on the local CPU. To ensure entries are flushed for non-local
> >> TLBs, set up an mmu notifier on the mshare mm and use the
> >> .arch_invalidate_secondary_tlbs callback to flush all TLBs.
> >> arch_invalidate_secondary_tlbs guarantees that TLB entries will be
> >> flushed before pages are freed when unmapping pages in an mshare region.
> >
> > Thanks for working on this, I think this is a really nice feature.
> >
> > An issue that I think this series doesn't address is:
> > There could be mmu_notifiers (for things like KVM or SVA IOMMU) that
> > want to be notified on changes to an mshare VMA; if those are not
> > invoked, we could get UAF of page contents. So either we propagate MMU
> > notifier invocations in the host mm into the mshare regions that use
> > it, or we'd have to somehow prevent a process from using MMU notifiers
> > and mshare at the same time.
>
> Thanks, Jann. I've noted this as an issue. Ultimately I think the
> notifiers calls will need to be propagated. It's going to be tricky, but
> I have some ideas.

Very naively, I think you could register your own notifier on the host
mm, with callbacks vaguely like the one below. The callbacks would walk
the rmap of the mshare file and invoke nested mmu notifiers on each VMA
that maps the file, basically like unmap_mapping_pages() except that
unmap_mapping_range_vma() is replaced with a notifier invocation?

static int mshare_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
    const struct mmu_notifier_range *range)
{
  /*
   * host_mm and mapping (the mshare file's address_space) are assumed to
   * be looked up from mn; that lookup is omitted here.
   */
  struct vm_area_struct *vma;
  pgoff_t first_index, last_index;

  /* Skip ranges that end below host_mm->mmap_base. */
  if (range->end < host_mm->mmap_base)
    return 0;
  /* Convert the host mm address range to page indices in the mshare file. */
  first_index = (max(range->start, host_mm->mmap_base) -
                 host_mm->mmap_base) / PAGE_SIZE;
  last_index = (range->end - host_mm->mmap_base) / PAGE_SIZE;
  i_mmap_lock_read(mapping);
  vma_interval_tree_foreach(vma, &mapping->i_mmap, first_index, last_index) {
    struct mmu_notifier_range nested_range;

    [... same math as in unmap_mapping_range_tree ...]
    mmu_notifier_range_init(&nested_range, range->event, vma->vm_mm,
                            nested_start, nested_end);
    mmu_notifier_invalidate_range_start(&nested_range);
  }
  i_mmap_unlock_read(mapping);
  return 0;
}
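
For illustration, the clamping that unmap_mapping_range_tree() does for
each VMA can be modeled in standalone userspace C roughly as below. This
is only a sketch: struct sketch_vma, struct sketch_range,
sketch_nested_range() and the 4 KiB page size are hypothetical stand-ins,
not kernel definitions, and the result corresponds to what the callback
above would call nested_start and nested_end.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for the vm_area_struct fields used by the math. */
struct sketch_vma {
  uint64_t vm_start; /* first mapped address */
  uint64_t vm_end;   /* one past the last mapped address */
  uint64_t vm_pgoff; /* file page index mapped at vm_start */
};

/* Address range in the mapping process: start inclusive, end exclusive. */
struct sketch_range {
  uint64_t start;
  uint64_t end;
};

static uint64_t sketch_vma_pages(const struct sketch_vma *vma)
{
  return (vma->vm_end - vma->vm_start) >> SKETCH_PAGE_SHIFT;
}

/*
 * Clamp the inclusive file page interval [first_index, last_index] to the
 * pages that vma maps, then translate it into addresses in vma's mm.
 * The intervals must overlap, which the interval tree walk guarantees.
 */
static struct sketch_range sketch_nested_range(const struct sketch_vma *vma,
                                               uint64_t first_index,
                                               uint64_t last_index)
{
  uint64_t vba = vma->vm_pgoff;                   /* first page of the VMA */
  uint64_t vea = vba + sketch_vma_pages(vma) - 1; /* last page of the VMA */
  uint64_t zba, zea;
  struct sketch_range r;

  assert(first_index <= vea && last_index >= vba);
  zba = first_index > vba ? first_index : vba;
  zea = last_index < vea ? last_index : vea;

  r.start = ((zba - vba) << SKETCH_PAGE_SHIFT) + vma->vm_start;
  r.end = ((zea - vba + 1) << SKETCH_PAGE_SHIFT) + vma->vm_start;
  return r;
}
```

With a 16-page VMA at vm_pgoff 4, an invalidation of file pages 2..6
clamps to pages 4..6, so the nested range covers the first three pages
of that VMA.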

And ensure that when mm_take_all_locks() encounters an mshare VMA, it
basically recursively does mm_take_all_locks() on the mshare host mm?

I think that might be enough to make it work, and the rest beyond that
would be optimizations?
