Message-ID: <aKimU6tf7-RnwISE@casper.infradead.org>
Date: Fri, 22 Aug 2025 18:18:11 +0100
From: Matthew Wilcox <willy@...radead.org>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: Brendan Jackman <jackmanb@...gle.com>, peterz@...radead.org,
bp@...en8.de, dave.hansen@...ux.intel.com, mingo@...hat.com,
tglx@...utronix.de, akpm@...ux-foundation.org, david@...hat.com,
derkling@...gle.com, junaids@...gle.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, reijiw@...gle.com,
rientjes@...gle.com, rppt@...nel.org, vbabka@...e.cz,
x86@...nel.org, yosry.ahmed@...ux.dev,
Liam Howlett <liam.howlett@...cle.com>,
"Kirill A. Shutemov" <kas@...nel.org>,
Harry Yoo <harry.yoo@...cle.com>, Jann Horn <jannh@...gle.com>,
Pedro Falcato <pfalcato@...e.de>, Andy Lutomirski <luto@...nel.org>,
Josh Poimboeuf <jpoimboe@...nel.org>, Kees Cook <kees@...nel.org>
Subject: Re: [Discuss] First steps for ASI (ASI is fast again)
On Fri, Aug 22, 2025 at 03:22:04PM +0100, Lorenzo Stoakes wrote:
> > What I think we can do is an mm-global flush whenever there's a
> > possibility that the process is losing logical access to a physical
> > page. So basically I think that's whenever we evict from the page cache,
> > or the user closes a file.
> >
> > ("Logical access" = we would let them do a read() that gives them the
> > contents of the page).
> >
> > The key insight is that a) those events are relatively rare and b)
> > already often involve big TLB flushes. So doing global flushes there is
> > not that bad, and this allows us to forget about all the particular
> > details of which pages might have TLB entries on which CPUs and just say
> > "_some_ CPU in this MM might have _some_ stale TLB entry", which is
> > simple and efficient to track.
>
> I guess rare to get truncation mid-way through a read(), closing it mid-way
> would be... a bug surely? :P
Truncation isn't a problem. The contents of the file were visible to
the process before. The folio can't get recycled while we have a
reference to it. You might get stale data, but that's just the race
going one way instead of the other.
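For concreteness, the coarse tracking scheme quoted above ("_some_ CPU in
this MM might have _some_ stale TLB entry") could be modelled with a single
per-mm bit. This is a userspace sketch, not kernel code; all names here
(mm_asi_state, asi_note_*) are hypothetical:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-mm state: one bit meaning "some CPU running this mm
 * might still hold a stale TLB entry for a page we no longer have
 * logical access to". */
struct mm_asi_state {
	atomic_bool maybe_stale;
};

/* Called on events that might create stale entries. */
static void asi_note_possible_stale(struct mm_asi_state *mm)
{
	atomic_store(&mm->maybe_stale, true);
}

/* Called where the process may lose logical access to a physical page
 * (page cache eviction, file close).  Those paths already do big TLB
 * flushes, so piggyback on that and clear the coarse bit. */
static void asi_note_global_flush(struct mm_asi_state *mm)
{
	/* in the kernel, the mm-global flush would happen here */
	atomic_store(&mm->maybe_stale, false);
}

/* Cheap check: flush only if the coarse bit says we might need to. */
static bool asi_needs_flush(struct mm_asi_state *mm)
{
	return atomic_load(&mm->maybe_stale);
}
```

The point of the single bit is that it trades precision for simplicity:
no per-page, per-CPU bookkeeping, just one rarely-taken flush path.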
> > > Hmm, CoW generally a pain. Could you go into more detail as to what's the issue
> > > here?
> >
> > It's just that you have two user pages that you wanna touch at once
> > (src, dst). This crappy ephmap implementation doesn't support two
> > mappings at once in the same context, so the second allocation fails, so
> > you always get an asi_exit().
>
> Right... well like can we just have space for 2 then? ;) it's mappings not
> actually allocating pages so... :)
For reference, kmap_local/atomic supports up to 16 at once. That may
be excessive, but it's cheap. Of course, kmap only supports a single
page at a time, not an entire folio. Now, the tradeoffs for kmap_local
are based on how much address space is available to a 32-bit process (ie
1GB, shared between lowmem, vmalloc space, ioremap space, kmap space,
and probably a bunch of things I'm forgetting).
There's MUCH more space available on 64-bit and I'm sure we can find
32MB to allow us to map 16 * 2MB folios. We can even make it easy and
always map on 2MB boundaries. We might get into A Bit Of Trouble if
we decide that we want to map x86 1GB pages or ARM 512MB (I think ARM
actually goes up to 4TB theoretically).
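Something like the slot scheme above is simple enough to sketch: a 32MB
virtual window split into 16 slots of 2MB each, with a one-word bitmap.
Userspace model only; the ephmap struct and function names are made up:

```c
#include <stdint.h>
#include <stddef.h>

#define EPHMAP_SLOTS     16
#define EPHMAP_SLOT_SIZE (2UL << 20)	/* one 2MB-aligned slot per folio */

/* Hypothetical per-context ephmap state: 16 * 2MB = 32MB of virtual
 * space, one bit per slot. */
struct ephmap {
	uintptr_t base;			/* 2MB-aligned window start */
	uint32_t  used;			/* bit i set => slot i in use */
};

/* Pick a free slot and return its 2MB-aligned address, or 0 if all
 * slots are busy (the case that today forces an asi_exit()). */
static uintptr_t ephmap_alloc(struct ephmap *em)
{
	for (int i = 0; i < EPHMAP_SLOTS; i++) {
		if (!(em->used & (1U << i))) {
			em->used |= 1U << i;
			return em->base + (uintptr_t)i * EPHMAP_SLOT_SIZE;
		}
	}
	return 0;
}

static void ephmap_free(struct ephmap *em, uintptr_t addr)
{
	size_t i = (addr - em->base) / EPHMAP_SLOT_SIZE;

	em->used &= ~(1U << i);
}
```

Always mapping on 2MB boundaries keeps the slot arithmetic trivial; the
CoW case just needs two allocations (src, dst) to succeed at once.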
If we're going this way, we might want to rework
folio_test_partial_kmap() callers to instead ask "what is the mapped
boundary of this folio", which might actually clean them up a bit.
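To illustrate what that reworked question could look like: instead of a
boolean folio_test_partial_kmap(), callers would ask how many bytes are
contiguously mapped from a given offset. Userspace model; the struct and
folio_mapped_extent() are hypothetical, not existing kernel API:

```c
#include <stddef.h>
#include <stdbool.h>

#define PAGE_SIZE 4096UL

/* Minimal stand-in for a folio, just for illustration. */
struct folio {
	size_t size;		/* total bytes in the folio */
	bool   partial_kmap;	/* only one page mapped at a time? */
};

/* Hypothetical helper: "what is the mapped boundary of this folio?",
 * expressed as the number of bytes mapped contiguously from @offset. */
static size_t folio_mapped_extent(const struct folio *folio, size_t offset)
{
	if (folio->partial_kmap)
		/* only the page containing @offset is mapped */
		return PAGE_SIZE - (offset & (PAGE_SIZE - 1));
	return folio->size - offset;
}
```

Callers then loop in extent-sized chunks regardless of whether the whole
folio is mapped, which is the cleanup being suggested.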