Message-ID: <accf2b4b-2a54-4261-b67e-010cb74082ae@intel.com>
Date: Wed, 2 Oct 2024 16:11:27 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Anthony Yznaga <anthony.yznaga@...cle.com>, akpm@...ux-foundation.org,
 willy@...radead.org, markhemm@...glemail.com, viro@...iv.linux.org.uk,
 david@...hat.com, khalid@...nel.org
Cc: andreyknvl@...il.com, luto@...nel.org, brauner@...nel.org, arnd@...db.de,
 ebiederm@...ssion.com, catalin.marinas@....com, linux-arch@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org, mhiramat@...nel.org,
 rostedt@...dmis.org, vasily.averin@...ux.dev, xhao@...ux.alibaba.com,
 pcc@...gle.com, neilb@...e.de, maz@...nel.org,
 David Rientjes <rientjes@...gle.com>
Subject: Re: [RFC PATCH v3 00/10] Add support for shared PTEs across processes

About TLB flushing...

The quick and dirty thing to do is just flush_tlb_all() after you remove
the PTE from the host mm.  That will surely work everywhere and it's as
dirt simple as you get.  Honestly, it might even be cheaper than the
alternative.

Also, I don't think PCIDs actually complicate the problem at all.  We
basically do remote mm TLB flushes using two mechanisms:

	1. If the mm is loaded, use INVLPG and friends to zap the TLB
	2. Bump mm->context.tlb_gen so that the next time it _gets_
	   loaded, the TLB is flushed.

flush_tlb_func() really only cares about #1 since if the mm isn't
loaded, it'll get flushed anyway at the next context switch.
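To make the two mechanisms concrete, here is a minimal user-space sketch.  It is a toy model, not kernel code: struct gen_mm, struct gen_cpu and the gen_*() helpers are hypothetical names made up for illustration, and the real x86 code tracks generations per ASID rather than per CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins, not the kernel's mm_struct or per-CPU TLB state. */
struct gen_mm {
	uint64_t tlb_gen;	/* bumped on every flush request */
};

struct gen_cpu {
	bool loaded;		/* is the mm currently loaded on this CPU? */
	uint64_t seen_gen;	/* generation this CPU's TLB is caught up to */
	unsigned int flushes;	/* flushes this CPU has performed */
};

/* Bring one CPU's TLB up to the mm's current generation. */
static void gen_do_flush(const struct gen_mm *mm, struct gen_cpu *cpu)
{
	assert(mm && cpu);
	cpu->seen_gen = mm->tlb_gen;
	cpu->flushes++;
}

/*
 * Request a flush: bump the generation (mechanism 2), then flush right
 * away only on CPUs that have the mm loaded (mechanism 1).
 * Returns how many CPUs were flushed immediately.
 */
static unsigned int gen_flush_request(struct gen_mm *mm, struct gen_cpu *cpus,
				      unsigned int nr_cpus)
{
	unsigned int i, hit = 0;

	mm->tlb_gen++;
	for (i = 0; i < nr_cpus; i++) {
		if (!cpus[i].loaded)
			continue;	/* caught later, at switch-in */
		gen_do_flush(mm, &cpus[i]);
		hit++;
	}
	return hit;
}

/*
 * Load the mm on a CPU.  If the CPU missed a generation while the mm
 * was not loaded, flush now.  Returns true if a flush happened.
 */
static bool gen_switch_in(const struct gen_mm *mm, struct gen_cpu *cpu)
{
	cpu->loaded = true;
	if (cpu->seen_gen == mm->tlb_gen)
		return false;
	gen_do_flush(mm, cpu);
	return true;
}
```

In this model, a CPU that does not have the mm loaded is skipped by gen_flush_request() and only catches up in gen_switch_in().  An mm that never gets loaded anywhere would therefore never be hit by the immediate path, which is the problem the host mm runs into.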

The alternatives I can think of:

Make flush_tlb_mm_range(host_mm) work somehow.  You'd need to somehow
keep mm_cpumask(host_mm) up to date and also do something to
flush_tlb_func() to tell it that 'loaded_mm' isn't relevant and it
should flush regardless.

The other way is to use the msharefs inode's ->i_mmap to find all the
VMAs mapping the file, and find all *their* mm's:

	for each vma in inode->i_mmap
		mm = vma->vm_mm
		flush_tlb_mm_range(<vma range here>)

But that might be even worse than flush_tlb_all() because it might end
up sending more than one IPI per CPU.

You can fix _that_ by keeping a single cpumask that you build up:

	mask = 0
	for each vma in inode->i_mmap
		mm = vma->vm_mm
		mask |= mm_cpumask(mm)

	flush_tlb_multi(mask, info);

Unfortunately, 'info->mm' needs to be more than one mm, so you probably
still need a new flush_tlb_func() flush type to tell it to ignore
'info->mm' and flush anyway.
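For a rough feel of the IPI math, here is a small user-space sketch comparing one flush per VMA against a single flush over the accumulated mask.  Everything in it is a hypothetical stand-in (toy_cpumask_t, struct mask_mm, struct mask_vma, the mask_*() helpers), and it assumes each flush sends exactly one IPI to every CPU in the mask, which ignores the local CPU and any lazy-TLB shortcuts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy cpumask: bit n set means CPU n may hold TLB entries for the mm. */
typedef uint64_t toy_cpumask_t;

struct mask_mm {
	toy_cpumask_t cpumask;		/* stand-in for mm_cpumask(mm) */
};

struct mask_vma {
	const struct mask_mm *vm_mm;	/* mm that owns this mapping */
};

/* Count the CPUs set in a mask. */
static unsigned int mask_weight(toy_cpumask_t m)
{
	unsigned int n = 0;

	while (m) {
		m &= m - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* One flush per VMA: every CPU in that VMA's mm mask gets its own IPI. */
static unsigned int mask_ipis_per_vma(const struct mask_vma *vmas, size_t nr)
{
	unsigned int ipis = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		assert(vmas[i].vm_mm);
		ipis += mask_weight(vmas[i].vm_mm->cpumask);
	}
	return ipis;
}

/* Build one mask across all VMAs, then flush once: one IPI per CPU at most. */
static unsigned int mask_ipis_union(const struct mask_vma *vmas, size_t nr)
{
	toy_cpumask_t mask = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		assert(vmas[i].vm_mm);
		mask |= vmas[i].vm_mm->cpumask;
	}
	return mask_weight(mask);
}
```

With three mms whose masks overlap (say CPUs {0,1}, {1,2} and {1}), the per-VMA loop counts five IPIs in this model while the accumulated mask counts three.  flush_tlb_all() would be bounded by the number of online CPUs, so which one wins depends on how widely the sharing processes are spread.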

After all that, I kinda like flush_tlb_all(). ;)
