Message-ID: <d650c29b-129f-4fac-9a9d-ea1fbdae2c3a@intel.com>
Date: Mon, 3 Jun 2024 06:23:46 -0700
From: Dave Hansen <dave.hansen@...el.com>
To: Byungchul Park <byungchul@...com>, David Hildenbrand <david@...hat.com>
Cc: Byungchul Park <lkml.byungchul.park@...il.com>,
 linux-kernel@...r.kernel.org, linux-mm@...ck.org, kernel_team@...ynix.com,
 akpm@...ux-foundation.org, ying.huang@...el.com, vernhao@...cent.com,
 mgorman@...hsingularity.net, hughd@...gle.com, willy@...radead.org,
 peterz@...radead.org, luto@...nel.org, tglx@...utronix.de, mingo@...hat.com,
 bp@...en8.de, dave.hansen@...ux.intel.com, rjgolo@...il.com
Subject: Re: [PATCH v11 09/12] mm: implement LUF(Lazy Unmap Flush) defering
 tlb flush when folios get unmapped

On 6/3/24 02:35, Byungchul Park wrote:
...
> In luf's point of view, the points where the deferred flush should be
> performed are simply:
> 
> 	1. when changing the vma maps, that might be luf'ed.
> 	2. when updating data of the pages, that might be luf'ed.

It's simple, but the devil is in the details as always.

> All we need to do is to identify the points:
> 
> 	1. when changing the vma maps, that might be luf'ed.
> 
> 	   a) mmap and munmap i.e. fault handler or unmap_region().
> 	   b) permission to writable i.e. mprotect or fault handler.
> 	   c) what I'm missing.

I'd say it even more generally: anything that installs a PTE which is
inconsistent with the original PTE.  That, of course, includes writes.
But it also includes crazy things that we do like uprobes.  Take a look
at __replace_page().

I think the page_vma_mapped_walk() checks plus the ptl keep LUF at bay
there.  But it needs some really thorough review.

But the bigger concern is that, if there was a problem, I can't think of
a systematic way to find it.

> 	2. when updating data of the pages, that might be luf'ed.
> 
> 	   a) updating files through vfs e.g. file_end_write().
> 	   b) updating files through writable maps i.e. 1-a) or 1-b).
> 	   c) what I'm missing.

Filesystems or block devices that change content without a "write" from
the local system.  Network filesystems and block devices come to mind.
I honestly don't know what all the rules are around these, but they
could certainly be troublesome.

There appear to be some interactions for NFS between file locking and
page cache flushing.

But, stepping back ...

I'd honestly be a lot more comfortable if there was even a debugging LUF
mode that enforced rules like:

  1. A LUF'd PTE can't be rewritten until after a luf_flush() occurs
  2. A LUF'd page's position in the page cache can't be replaced until
     after a luf_flush()

or *some* other independent set of rules that can tell us when something
goes wrong.  That uprobes code, for instance, seems like it will work.
But I can also imagine writing it ten other ways where it would break
when combined with LUF.
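
To make rule 1 concrete, here is a minimal userspace sketch of what such
a debugging check might track. It is a toy model, not kernel code: every
name in it (model_pte, model_luf_unmap(), model_luf_flush(),
model_set_pte(), luf_violations) is hypothetical and invented for
illustration, and a real implementation would have to hook the actual
PTE-install paths and deal with locking. Rule 2 could presumably be
modeled the same way, with a pending mark on page cache slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
#define NR_PTES 8

struct model_pte {
	unsigned long val;	/* stand-in for the PTE contents */
	bool luf_pending;	/* unmapped, TLB flush still deferred */
};

static struct model_pte ptes[NR_PTES];
static unsigned int luf_violations;

/* Unmap an entry but defer its TLB flush, as LUF would. */
static void model_luf_unmap(size_t idx)
{
	assert(idx < NR_PTES);
	ptes[idx].val = 0;
	ptes[idx].luf_pending = true;
}

/* Stand-in for luf_flush(): the deferred flush has now happened. */
static void model_luf_flush(void)
{
	for (size_t i = 0; i < NR_PTES; i++)
		ptes[i].luf_pending = false;
}

/*
 * Rule 1: a LUF'd PTE can't be rewritten until after a flush.
 * Count the violation and refuse the write instead of installing it.
 */
static bool model_set_pte(size_t idx, unsigned long val)
{
	assert(idx < NR_PTES);
	if (ptes[idx].luf_pending) {
		luf_violations++;
		return false;
	}
	ptes[idx].val = val;
	return true;
}
```

In this model, calling model_luf_unmap(3) and then model_set_pte(3, ...)
before model_luf_flush() bumps luf_violations and leaves the entry
untouched; the same write after model_luf_flush() succeeds. The point is
only that the check is independent of how any particular caller is
written.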
