linux-kernel - Re: [PATCH v11 09/12] mm: implement LUF(Lazy Unmap Flush) defering tlb flush when folios get unmapped

Message-ID: <Zmg7GXK1SGFJNdge@tiehlicka>
Date: Tue, 11 Jun 2024 13:55:05 +0200
From: Michal Hocko <mhocko@...e.com>
To: Byungchul Park <byungchul@...com>
Cc: Matthew Wilcox <willy@...radead.org>,
	Dave Hansen <dave.hansen@...el.com>,
	David Hildenbrand <david@...hat.com>,
	Byungchul Park <lkml.byungchul.park@...il.com>,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org,
	kernel_team@...ynix.com, akpm@...ux-foundation.org,
	ying.huang@...el.com, vernhao@...cent.com,
	mgorman@...hsingularity.net, hughd@...gle.com, peterz@...radead.org,
	luto@...nel.org, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
	dave.hansen@...ux.intel.com, rjgolo@...il.com
Subject: Re: [PATCH v11 09/12] mm: implement LUF(Lazy Unmap Flush) defering
 tlb flush when folios get unmapped

On Tue 11-06-24 09:55:23, Byungchul Park wrote:
> On Mon, Jun 10, 2024 at 03:23:49PM +0200, Michal Hocko wrote:
> > On Tue 04-06-24 09:34:48, Byungchul Park wrote:
> > > On Mon, Jun 03, 2024 at 06:01:05PM +0100, Matthew Wilcox wrote:
> > > > On Mon, Jun 03, 2024 at 09:37:46AM -0700, Dave Hansen wrote:
> > > > > Yeah, we'd need some equivalent of a PTE marker, but for the page cache.
> > > > >  Presumably some xa_value() that means a reader has to go do a
> > > > > luf_flush() before going any farther.
> > > > 
> > > > I can allocate one for that.  We've got something like 1000 currently
> > > > unused values which can't be mistaken for anything else.
> > > > 
> > > > > That would actually have a chance at fixing two issues:  One where a new
> > > > > page cache insertion is attempted.  The other where someone goes to look
> > > > > in the page cache and takes some action _because_ it is empty (I think
> > > > > NFS is doing some of this for file locks).
> > > > > 
> > > > > LUF is also pretty fundamentally built on the idea that files can't
> > > > > change without LUF being aware.  That model seems to work decently for
> > > > > normal old filesystems on normal old local block devices.  I'm worried
> > > > > about NFS, and I don't know how seriously folks take FUSE, but it
> > > > > obviously can't work well for FUSE.
> > > > 
> > > > I'm more concerned with:
> > > > 
> > > >  - page goes back to buddy
> > > >  - page is allocated to slab
> > > 
> > > At this point, tlb flush needed will be performed in prep_new_page().
> > 
> > But that does mean that an unaware caller would get an additional
> > overhead of the flushing, right? I think it would be just a matter of
> 
> pcp for locality is already a better source of side channel attack.  FYI,
> tlb flush gets barely performed only if pending tlb flush exists.

Right, but rare and hard-to-predict latencies are much worse than
consistent ones.

> > time before somebody can turn that into a side channel attack, not to
> > mention unexpected latencies introduced.
> 
> Nope.  The pending tlb flush performed in prep_new_page() is the one
> that would've done already with the vanilla kernel.  It's not additional
> tlb flushes but it's subset of all the skipped ones.

But those skipped ones could have happened in a completely different
context (e.g. a different process or even a different security domain),
right?
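
To make the point about context concrete, here is a toy user-space
model, not kernel code. Every name in it (ToyLUF, unmap, alloc, run) is
hypothetical, and it assumes, as a simplification, that every deferred
flush is eventually paid at allocation time, whereas the patch only
performs a subset of the skipped ones. It merely shows that the total
number of flushes can stay the same while the context paying for them
changes.

```python
# Toy model of deferred TLB flushing. This is an illustration only and
# does not reflect the actual LUF implementation in the patch series.
from collections import defaultdict


class ToyLUF:
    def __init__(self):
        self.pending = set()                  # pages with a flush still owed
        self.flushes_paid = defaultdict(int)  # context -> flushes it performed

    def unmap(self, page, context, lazy=True):
        """A context unmaps a page: flush now (eager) or defer it (lazy)."""
        if lazy:
            self.pending.add(page)
        else:
            self.flushes_paid[context] += 1

    def alloc(self, page, context):
        """A context allocates a page and pays any flush still owed on it."""
        if page in self.pending:
            self.pending.discard(page)
            self.flushes_paid[context] += 1


def run(lazy):
    """Context "A" unmaps three pages, then context "B" allocates them."""
    model = ToyLUF()
    for page in range(3):
        model.unmap(page, "A", lazy=lazy)
    for page in range(3):
        model.alloc(page, "B")
    return dict(model.flushes_paid)


if __name__ == "__main__":
    print("eager:", run(lazy=False))
    print("lazy: ", run(lazy=True))
```

In the eager run all three flushes are charged to "A", the context that
did the unmapping. In the lazy run the same three flushes are charged to
"B", which never touched those mappings.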

> It's worth noting all the existing mm reclaim mechanisms have already
> introduced worse unexpected latencies.

Right, but reclaim, especially direct reclaim, is expected to be
slow. It is quite different to see latency spikes on a system with a lot
of memory.
-- 
Michal Hocko
SUSE Labs