linux-kernel - Re: [PATCH v3 0/6] mm: split underutilized THPs


Message-ID: <87y150mj6f.fsf@linux.intel.com>
Date: Tue, 13 Aug 2024 10:22:48 -0700
From: Andi Kleen <ak@...ux.intel.com>
To: Usama Arif <usamaarif642@...il.com>
Cc: akpm@...ux-foundation.org,  linux-mm@...ck.org,  hannes@...xchg.org,
  riel@...riel.com,  shakeel.butt@...ux.dev,  roman.gushchin@...ux.dev,
  yuzhao@...gle.com,  david@...hat.com,  baohua@...nel.org,
  ryan.roberts@....com,  rppt@...nel.org,  willy@...radead.org,
  cerasuolodomenico@...il.com,  corbet@....net,
  linux-kernel@...r.kernel.org,  linux-doc@...r.kernel.org,
  kernel-team@...a.com
Subject: Re: [PATCH v3 0/6] mm: split underutilized THPs

Usama Arif <usamaarif642@...il.com> writes:
>
> This patch-series is an attempt to mitigate the issue of running out of
> memory when THP is always enabled. During runtime whenever a THP is being
> faulted in or collapsed by khugepaged, the THP is added to a list.
> Whenever memory reclaim happens, the kernel runs the deferred_split
> shrinker which goes through the list and checks if the THP was underutilized,
> i.e. how many of the base 4K pages of the entire THP were zero-filled.
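
As an illustration of the check described in the quoted text, here is
a minimal user-space sketch that counts the zero-filled 4K pages in a
THP-sized region. It is not the code from the patch series. The names
subpage_is_zero, count_zero_subpages and thp_is_underutilized, the
2 MiB THP size, and the max_zero threshold are all assumptions made
for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BASE_PAGE_SIZE 4096UL              /* assumed base page size */
#define THP_SIZE (2UL * 1024 * 1024)       /* assumed 2 MiB THP */

/* Return true if every byte of one base page is zero. */
bool subpage_is_zero(const unsigned char *page)
{
    assert(page != NULL);
    for (size_t i = 0; i < BASE_PAGE_SIZE; i++)
        if (page[i] != 0)
            return false;
    return true;
}

/* Count the zero-filled base pages in a THP-sized region. */
size_t count_zero_subpages(const unsigned char *thp)
{
    size_t zero = 0;

    for (size_t off = 0; off < THP_SIZE; off += BASE_PAGE_SIZE)
        if (subpage_is_zero(thp + off))
            zero++;
    return zero;
}

/* Hypothetical policy: the THP counts as underutilized when more
 * than max_zero of its base pages are zero-filled. */
bool thp_is_underutilized(const unsigned char *thp, size_t max_zero)
{
    return count_zero_subpages(thp) > max_zero;
}
```

A region that was faulted in but only partly written would report a
high zero count here, which is the case the series wants to split.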

Sometimes when writing a benchmark I fill things with zero explicitly
to avoid faults later. For example if you want to measure memory
read bandwidth you need to fault the pages first, but that fault
pattern may well be zero.

With your patch if there is memory pressure there are two effects:

- If things are remapped to the zero page the benchmark
reading memory may give unrealistically good results because
what it thinks is a big memory area is actually only backed
by a single page.

- If I expect to write I may end up with an unexpected zeropage->real
memory fault if the pages got remapped. 
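
The kind of benchmark setup meant above could look roughly like the
following sketch. The helper names alloc_prefaulted and read_all are
made up for the example, and malloc stands in for whatever allocator
the benchmark really uses. The buffer is zero-filled only to fault the
pages in, and the read pass is the part that would be timed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Allocate a buffer and write zeros to every byte so that all of its
 * pages are faulted in before measurement. Returns NULL on failure. */
void *alloc_prefaulted(size_t len)
{
    void *buf = malloc(len);

    if (buf == NULL)
        return NULL;
    memset(buf, 0, len);    /* explicit zero fill, done only to prefault */
    return buf;
}

/* Read the whole buffer once. The volatile access and the returned
 * sum keep the compiler from dropping the loads. */
uint64_t read_all(const void *buf, size_t len)
{
    const volatile uint64_t *p = buf;
    uint64_t sum = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < len / sizeof(*p); i++)
        sum += p[i];
    return sum;
}
```

If the zero-filled pages are later remapped to the shared zero page,
read_all still returns 0, but the loads all hit one physical page, so
the measured bandwidth no longer reflects a buffer of len bytes.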

I expect such patterns can happen without benchmarking too.
I could see it being a problem for latency sensitive applications.

Now you could argue that this all should only happen under memory
pressure and when that happens things may be slow anyways and your
patch will still be an improvement.

Maybe that's true but there might still be corner cases
which are negatively impacted by this. I don't have a good solution
other than a tunable, but I expect it will cause problems for someone.

The other problem I have with your patch is that it may cause the kernel
to pollute CPU caches in the background, which again will cause noise in
the system. Instead of plain memchr_inv, you should probably use some
primitive to bypass caches or use an NTA prefetch hint at least.
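
A rough user-space sketch of the prefetch-hint idea, not kernel code:
the function name page_is_zero_nta and the 4096-byte page and 64-byte
line sizes are assumptions. It uses the GCC/Clang builtin
__builtin_prefetch with locality 0 ("no temporal locality"), which on
x86 is typically emitted as prefetchnta. How much this actually
reduces cache pollution depends on the CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_BYTES 4096UL   /* assumed base page size */
#define LINE_BYTES 64UL     /* assumed cache line size */

/* Return true if the page contains only zero bytes. Before scanning
 * each cache line, the next line is requested with a non-temporal
 * prefetch hint. */
bool page_is_zero_nta(const void *page)
{
    const unsigned char *bytes = page;

    assert(page != NULL);
    for (size_t off = 0; off < PAGE_BYTES; off += LINE_BYTES) {
        /* rw = 0 (read), locality = 0 (no temporal locality) */
        if (off + LINE_BYTES < PAGE_BYTES)
            __builtin_prefetch(bytes + off + LINE_BYTES, 0, 0);

        for (size_t i = 0; i < LINE_BYTES; i++)
            if (bytes[off + i] != 0)
                return false;   /* stop at the first non-zero byte */
    }
    return true;
}
```

The early return matters as well: a page with data near its start is
rejected after touching only a line or two.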

-Andi