Message-ID: <20210506193026.GE388843@casper.infradead.org>
Date: Thu, 6 May 2021 20:30:26 +0100
From: Matthew Wilcox <willy@...radead.org>
To: David Hildenbrand <david@...hat.com>
Cc: Zi Yan <ziy@...dia.com>, Oscar Salvador <osalvador@...e.de>,
Michael Ellerman <mpe@...erman.id.au>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Thomas Gleixner <tglx@...utronix.de>, x86@...nel.org,
Andy Lutomirski <luto@...nel.org>,
"Rafael J . Wysocki" <rafael@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Mike Rapoport <rppt@...nel.org>,
Anshuman Khandual <anshuman.khandual@....com>,
Michal Hocko <mhocko@...e.com>,
Dan Williams <dan.j.williams@...el.com>,
Wei Yang <richard.weiyang@...ux.alibaba.com>,
linux-ia64@...r.kernel.org, linux-kernel@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, linux-mm@...ck.org
Subject: Re: [RFC PATCH 0/7] Memory hotplug/hotremove at subsection size

On Thu, May 06, 2021 at 09:10:52PM +0200, David Hildenbrand wrote:
> I have to admit that I am not really a friend of that. I still think our
> target goal should be to have gigantic THP *in addition to* ordinary THP.
> Use gigantic THP where enabled and possible, and just use ordinary THP
> everywhere else. Having one pageblock granularity is a real limitation IMHO
> and requires us to hack the system to support it to some degree.

You're thinking too small with only two THP sizes ;-) I'm aiming to
support arbitrary power-of-two memory allocations. I think there's a
fruitful discussion to be had about how that works for anonymous memory --
with page cache, we have readahead to tell us when our predictions of use
are actually fulfilled. It doesn't tell us what percentage of the pages
allocated were actually used, but it's a hint. It's a big lift to go from
2MB all the way to 1GB ... if you can look back to see that the previous
1GB was basically fully populated, then maybe jump up from allocating
2MB folios to allocating a 1GB folio, but wow, that's a big step.

This goal really does mean that we want to allocate from the page
allocator, and so we do want to grow MAX_ORDER. I suppose we could
do something ugly like
	if (order <= MAX_ORDER)
		alloc_page()
	else
		alloc_really_big_page()
but that feels like unnecessary hardship to place on the user.
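
To put rough numbers on that split, here is a minimal user-space sketch, not kernel code. It assumes 4KB base pages and a largest buddy order of 10; both values, and every sketch_* name, are hypothetical and chosen only for illustration. Under those assumptions a 2MB request is order 9 and stays on the ordinary path, while a 1GB request is order 18 and would force the caller onto the second allocator.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions for illustration only: 4KB base pages and a largest
 * buddy-allocator order of 10. Neither value comes from the message. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_MAX_ORDER  10

/* Hypothetical labels for the two branches of the if/else above. */
enum sketch_path { SKETCH_BUDDY, SKETCH_REALLY_BIG };

/* Smallest order whose block (2^order base pages) covers `bytes`. */
static unsigned int sketch_size_to_order(size_t bytes)
{
	size_t pages;
	unsigned int order = 0;

	assert(bytes > 0);
	/* Round the byte count up to whole base pages. */
	pages = (bytes + ((size_t)1 << SKETCH_PAGE_SHIFT) - 1) >> SKETCH_PAGE_SHIFT;
	/* Grow the order until the power-of-two block is large enough. */
	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/* Mirrors the if/else: which allocator the caller would have to pick. */
static enum sketch_path sketch_pick_path(unsigned int order)
{
	return order <= SKETCH_MAX_ORDER ? SKETCH_BUDDY : SKETCH_REALLY_BIG;
}
```

Every caller that might want a large folio would need to carry that branch, which is the hardship in question; raising the maximum order makes the second path unnecessary.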
I know that for the initial implementation, we're going to rely on hints
from the user to use 1GB pages, but it'd be nice to not do that.