Message-ID: <d650c29b-129f-4fac-9a9d-ea1fbdae2c3a@intel.com>
Date: Mon, 3 Jun 2024 06:23:46 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Byungchul Park <byungchul@...com>, David Hildenbrand <david@...hat.com>
Cc: Byungchul Park <lkml.byungchul.park@...il.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org, kernel_team@...ynix.com,
akpm@...ux-foundation.org, ying.huang@...el.com, vernhao@...cent.com,
mgorman@...hsingularity.net, hughd@...gle.com, willy@...radead.org,
peterz@...radead.org, luto@...nel.org, tglx@...utronix.de, mingo@...hat.com,
bp@...en8.de, dave.hansen@...ux.intel.com, rjgolo@...il.com
Subject: Re: [PATCH v11 09/12] mm: implement LUF(Lazy Unmap Flush) defering
tlb flush when folios get unmapped
On 6/3/24 02:35, Byungchul Park wrote:
...
> In luf's point of view, the points where the deferred flush should be
> performed are simply:
>
> 1. when changing the vma maps, that might be luf'ed.
> 2. when updating data of the pages, that might be luf'ed.
It's simple, but the devil is in the details as always.
> All we need to do is to identify the points:
>
> 1. when changing the vma maps, that might be luf'ed.
>
> a) mmap and munmap i.e. fault handler or unmap_region().
> b) permission to writable i.e. mprotect or fault handler.
> c) what I'm missing.
I'd say it even more generally: anything that installs a PTE which is
inconsistent with the original PTE. That, of course, includes writes.
But it also includes crazy things that we do like uprobes. Take a look
at __replace_page().
I think the page_vma_mapped_walk() checks plus the ptl keep LUF at bay
there. But it needs some really thorough review.
But the bigger concern is that, if there was a problem, I can't think of
a systematic way to find it.
> 2. when updating data of the pages, that might be luf'ed.
>
> a) updating files through vfs e.g. file_end_write().
> b) updating files through writable maps i.e. 1-a) or 1-b).
> c) what I'm missing.
Filesystems or block devices that change content without a "write" from
the local system. Network filesystems and block devices come to mind.
I honestly don't know what all the rules are around these, but they
could certainly be troublesome.
There appear to be some interactions for NFS between file locking and
page cache flushing.
But, stepping back ...
I'd honestly be a lot more comfortable if there was even a debugging LUF
mode that enforced a rule that said:
1. A LUF'd PTE can't be rewritten until after a luf_flush() occurs
2. A LUF'd page's position in the page cache can't be replaced until
after a luf_flush()
or *some* other independent set of rules that can tell us when something
goes wrong. That uprobes code, for instance, seems like it will work.
But I can also imagine writing it ten other ways where it would break
when combined with LUF.
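To make the two rules above concrete, here is a rough userspace sketch of what such a debugging mode could check. It is a toy model, not kernel code: every name in it (toy_pte, toy_page, toy_luf_unmap(), toy_set_pte(), and so on) is hypothetical, and toy_luf_flush() only stands in for luf_flush(). It assumes a single global flush generation, which is almost certainly simpler than what the real series would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a "debug LUF" mode. Not kernel code. */

/* Global flush generation; bumped by every (stand-in) luf_flush(). */
static unsigned long toy_luf_gen = 1;
/* Number of rule violations caught so far. */
static unsigned long toy_luf_violations;

/* Per-object tracking: generation in which the object was LUF'd (0 = never). */
struct toy_luf_state { unsigned long pending_gen; };

struct toy_pte { unsigned long val; struct toy_luf_state luf; };
struct toy_page { struct toy_luf_state luf; };
struct toy_cache_slot { struct toy_page *page; };

/* True while the object was LUF'd and no flush has happened since. */
static bool toy_luf_pending(const struct toy_luf_state *s)
{
	return s->pending_gen == toy_luf_gen;
}

/* Unmap a page but defer the TLB flush: mark both PTE and page as LUF'd. */
static void toy_luf_unmap(struct toy_pte *pte, struct toy_page *page)
{
	pte->val = 0;
	pte->luf.pending_gen = toy_luf_gen;
	page->luf.pending_gen = toy_luf_gen;
}

/* Stand-in for luf_flush(): everything LUF'd before this point is now safe. */
static void toy_luf_flush(void)
{
	toy_luf_gen++;
}

/* Rule 1: a LUF'd PTE can't be rewritten until after a flush. */
static bool toy_set_pte(struct toy_pte *pte, unsigned long val)
{
	if (toy_luf_pending(&pte->luf)) {
		toy_luf_violations++;
		return false;
	}
	pte->val = val;
	return true;
}

/* Rule 2: a LUF'd page's page cache position can't be replaced until a flush. */
static bool toy_replace_cache_page(struct toy_cache_slot *slot,
				   struct toy_page *new_page)
{
	if (slot->page && toy_luf_pending(&slot->page->luf)) {
		toy_luf_violations++;
		return false;
	}
	slot->page = new_page;
	return true;
}

/* Usage: returns how many violations the checks caught during this run. */
static unsigned long toy_luf_selftest(void)
{
	struct toy_pte pte = { .val = 0x1000 };
	struct toy_page old_page = { { 0 } }, new_page = { { 0 } };
	struct toy_cache_slot slot = { .page = &old_page };
	unsigned long before = toy_luf_violations;
	bool ok;

	toy_luf_unmap(&pte, &old_page);

	ok = toy_set_pte(&pte, 0x2000);			/* breaks rule 1 */
	assert(!ok);
	ok = toy_replace_cache_page(&slot, &new_page);	/* breaks rule 2 */
	assert(!ok);

	toy_luf_flush();

	ok = toy_set_pte(&pte, 0x2000);			/* fine after the flush */
	assert(ok && pte.val == 0x2000);
	ok = toy_replace_cache_page(&slot, &new_page);
	assert(ok && slot.page == &new_page);

	return toy_luf_violations - before;
}
```

The point of counting violations instead of silently fixing them up is that the check stays independent of the LUF logic itself, so it can flag a caller (uprobes or anything else) that rewrites a PTE or swaps a page cache entry too early.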