linux-kernel - Re: [PATCH v11 09/12] mm: implement LUF(Lazy Unmap Flush) defering tlb flush when folios get unmapped

Message-ID: <26dc4594-430b-483c-a26c-7e68bade74b0@redhat.com>
Date: Sat, 1 Jun 2024 09:22:17 +0200
From: David Hildenbrand <david@...hat.com>
To: Dave Hansen <dave.hansen@...el.com>,
 Byungchul Park <lkml.byungchul.park@...il.com>
Cc: Byungchul Park <byungchul@...com>, linux-kernel@...r.kernel.org,
 linux-mm@...ck.org, kernel_team@...ynix.com, akpm@...ux-foundation.org,
 ying.huang@...el.com, vernhao@...cent.com, mgorman@...hsingularity.net,
 hughd@...gle.com, willy@...radead.org, peterz@...radead.org,
 luto@...nel.org, tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
 dave.hansen@...ux.intel.com, rjgolo@...il.com
Subject: Re: [PATCH v11 09/12] mm: implement LUF(Lazy Unmap Flush) defering
 tlb flush when folios get unmapped

On 31.05.24 23:46, Dave Hansen wrote:
> On 5/31/24 11:04, Byungchul Park wrote:
> ...
>> I don't believe you do not agree with the concept itself.  Thing is
>> the current version is not good enough.  I will do my best by doing
>> what I can do.
> 
> More performance is good.  I agree with that.
> 
> But it has to be weighed against the risk and the complexity.  The more
> I look at this approach, the more I think this is not a good trade off.
> There's a lot of risk and a lot of complexity and we haven't seen the
> full complexity picture.  The gaps are being fixed by adding complexity
> in new subsystems (the VFS in this case).
> 
> There are going to be winners and losers, and this version for example
> makes file writes lose performance.
> 
> Just to be crystal clear: I disagree with the concept of leaving stale
> TLB entries in place in an attempt to gain performance.

There is the inherent problem that a CPU reading from such (unmapped but 
not flushed yet) memory will not get a page fault, which I think is the 
most controversial part here (besides interaction with other deferred 
TLB flushing, and how this glues into the buddy).

What we have done so far is limit the time frame in which that can 
happen, under well-controlled circumstances. On the common unmap/zap 
path, we perform the batched TLB flush before any page faults / VMA 
changes would have been possible and before munmap() would have 
returned with "success". Now that time frame could be significantly 
longer.

So in current code, at the point in time where we would process a page 
fault, mmap()/munmap()/... the TLB would have been flushed already.
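
To make the difference concrete, here is a toy user-space model (not 
kernel code) with one page, one CPU and one cached translation. All 
names (toy_mm, toy_access_faults, toy_munmap_eager, 
toy_munmap_deferred) are made up for illustration and do not exist in 
the kernel or in the patch set.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: one page, one CPU, one cached translation. */
struct toy_mm {
	bool pte_present;	/* page table still maps the page */
	bool tlb_cached;	/* CPU holds a cached translation */
};

/* Returns true if the access would raise a page fault in this model. */
static bool toy_access_faults(struct toy_mm *mm)
{
	assert(mm);
	if (mm->tlb_cached)
		return false;		/* TLB hit: page table is never consulted */
	if (mm->pte_present) {
		mm->tlb_cached = true;	/* walk succeeds and fills the TLB */
		return false;
	}
	return true;			/* no translation left: page fault */
}

/* Behavior described above: flush before "success" is reported. */
static void toy_munmap_eager(struct toy_mm *mm)
{
	mm->pte_present = false;
	mm->tlb_cached = false;
}

/* Deferred variant: the PTE is gone, the stale TLB entry survives. */
static void toy_munmap_deferred(struct toy_mm *mm)
{
	mm->pte_present = false;
}
```

In this model, an access after toy_munmap_eager() faults, while an 
access after toy_munmap_deferred() still hits the stale entry and 
does not fault until something flushes it.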

To "mimic" the old behavior, we'd essentially have to force any page 
faults/mmap/whatsoever to perform the deferred flush such that the CPU 
will see the "reality" again. Not sure how that could be done in a 
*consistent* way (check whenever we take the mmap/vma lock etc ...) and 
if there would still be a performance win.
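
A rough user-space sketch of what "force the deferred flush at every 
entry point" could look like, again as a toy model and not kernel 
code. The flush_pending flag and the luf_* helpers are hypothetical 
names chosen for this example; where such a hook would really have to 
sit (mmap lock, VMA lock, fault path, ...) is exactly the open 
question.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code. All names here are hypothetical. */
struct luf_mm {
	bool pte_present;	/* page table still maps the page */
	bool tlb_cached;	/* CPU holds a (possibly stale) translation */
	bool flush_pending;	/* a deferred TLB flush is still owed */
	unsigned int flushes;	/* flushes actually performed */
};

/* Unmap without flushing: only record that a flush is owed. */
static void luf_unmap(struct luf_mm *mm)
{
	mm->pte_present = false;
	mm->flush_pending = true;
}

/* Perform the owed flush, if any. Cheap when nothing is pending. */
static void luf_flush_if_pending(struct luf_mm *mm)
{
	if (!mm->flush_pending)
		return;
	mm->tlb_cached = false;
	mm->flush_pending = false;
	mm->flushes++;
}

/* Stand-in for any entry point (page fault, mmap(), munmap(), ...). */
static void luf_enter_mm_op(struct luf_mm *mm)
{
	luf_flush_if_pending(mm);	/* CPU sees "reality" from here on */
}

/* Plain CPU read: returns true if it would fault in this model. */
static bool luf_cpu_read(struct luf_mm *mm)
{
	assert(mm);
	if (mm->tlb_cached)
		return false;	/* stale hit: no hook runs on this path */
	if (mm->pte_present) {
		mm->tlb_cached = true;
		return false;
	}
	return true;
}
```

In the model, a read right after luf_unmap() still succeeds through 
the stale entry, because no entry point was passed on that path. Only 
after luf_enter_mm_op() has run does the same read fault, and every 
entry point pays for the pending check, which is where the remaining 
performance win becomes questionable.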

-- 
Cheers,

David / dhildenb