lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 22 Nov 2021 04:56:25 +0000
From:   Matthew Wilcox <willy@...radead.org>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     David Hildenbrand <david@...hat.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Yang Shi <shy828301@...il.com>, Zi Yan <ziy@...dia.com>,
        Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: split thp synchronously on MADV_DONTNEED

On Sat, Nov 20, 2021 at 12:12:30PM -0800, Shakeel Butt wrote:
> Many applications do sophisticated management of their heap memory for
> better performance but with low cost. We have a bunch of such
> applications running on our production and examples include caching and
> data storage services. These applications keep their hot data on the
> THPs for better performance and release the cold data through
> MADV_DONTNEED to keep the memory cost low.
> 
> The kernel defers the split and release of THPs until there is memory
> pressure. This causes complicates the memory management of these
> sophisticated applications which then needs to look into low level
> kernel handling of THPs to better gauge their headroom for expansion. In
> addition these applications are very latency sensitive and would prefer
> to not face memory reclaim due to non-deterministic nature of reclaim.
> 
> This patch let such applications not worry about the low level handling
> of THPs in the kernel and splits the THPs synchronously on
> MADV_DONTNEED.

I've been wondering about whether this is really the right strategy
(and this goes wider than just this one, new case)

We chose to use a 2MB page here, based on whatever heuristics are
currently in play.  Now userspace is telling us we were wrong and should
have used smaller pages.  2MB pages are precious, and we currently
have one.  Surely it is better to migrate the still-valid contents of
this 2MB page to smaller pages, and then free the 2MB page as a single
unit than it is to fragment this 2MB page into smaller chunks, and keep
using some of it, virtually guaranteeing this particular 2MB page can't
be reassembled without significant work?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ