Message-ID: <Zmg7GXK1SGFJNdge@tiehlicka>
Date: Tue, 11 Jun 2024 13:55:05 +0200
From: Michal Hocko <mhocko@...e.com>
To: Byungchul Park <byungchul@...com>
Cc: Matthew Wilcox <willy@...radead.org>,
Dave Hansen <dave.hansen@...el.com>,
David Hildenbrand <david@...hat.com>,
Byungchul Park <lkml.byungchul.park@...il.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
kernel_team@...ynix.com, akpm@...ux-foundation.org,
ying.huang@...el.com, vernhao@...cent.com,
mgorman@...hsingularity.net, hughd@...gle.com, peterz@...radead.org,
luto@...nel.org, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, rjgolo@...il.com
Subject: Re: [PATCH v11 09/12] mm: implement LUF(Lazy Unmap Flush) defering
tlb flush when folios get unmapped
On Tue 11-06-24 09:55:23, Byungchul Park wrote:
> On Mon, Jun 10, 2024 at 03:23:49PM +0200, Michal Hocko wrote:
> > On Tue 04-06-24 09:34:48, Byungchul Park wrote:
> > > On Mon, Jun 03, 2024 at 06:01:05PM +0100, Matthew Wilcox wrote:
> > > > On Mon, Jun 03, 2024 at 09:37:46AM -0700, Dave Hansen wrote:
> > > > > Yeah, we'd need some equivalent of a PTE marker, but for the page cache.
> > > > > Presumably some xa_value() that means a reader has to go do a
> > > > > luf_flush() before going any farther.
> > > >
> > > > I can allocate one for that. We've got something like 1000 currently
> > > > unused values which can't be mistaken for anything else.
> > > >
> > > > > That would actually have a chance at fixing two issues: One where a new
> > > > > page cache insertion is attempted. The other where someone goes to look
> > > > > in the page cache and takes some action _because_ it is empty (I think
> > > > > NFS is doing some of this for file locks).
> > > > >
> > > > > LUF is also pretty fundamentally built on the idea that files can't
> > > > > change without LUF being aware. That model seems to work decently for
> > > > > normal old filesystems on normal old local block devices. I'm worried
> > > > > about NFS, and I don't know how seriously folks take FUSE, but it
> > > > > obviously can't work well for FUSE.
> > > >
> > > > I'm more concerned with:
> > > >
> > > > - page goes back to buddy
> > > > - page is allocated to slab
> > >
> > > At this point, tlb flush needed will be performed in prep_new_page().
> >
> > But that does mean that an unaware caller would get an additional
> > overhead of the flushing, right? I think it would be just a matter of
>
> pcp for locality is already a better source of side channel attack. FYI,
> tlb flush gets barely performed only if pending tlb flush exists.
Right, but rare and hard-to-predict latencies are much worse than
consistent ones.
> > time before somebody can turn that into a side channel attack, not to
> > mention unexpected latencies introduced.
>
> Nope. The pending tlb flush performed in prep_new_page() is the one
> that would've done already with the vanilla kernel. It's not additional
> tlb flushes but it's subset of all the skipped ones.
But those skipped ones could have happened in a completely different
context (e.g. a different process or even a different security domain),
right?
> It's worth noting all the existing mm reclaim mechanisms have already
> introduced worse unexpected latencies.
Right, but reclaim, especially direct reclaim, is expected to be
slow. It is quite different to see latency spikes on a system with a
lot of memory.
--
Michal Hocko
SUSE Labs