Message-ID: <f84c2b88-9912-d716-e83e-749fbfb6ff30@redhat.com>
Date: Mon, 5 Oct 2020 19:39:53 +0200
From: David Hildenbrand <david@...hat.com>
To: Zi Yan <ziy@...dia.com>, Michal Hocko <mhocko@...e.com>
Cc: linux-mm@...ck.org,
"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
Rik van Riel <riel@...riel.com>,
Roman Gushchin <guro@...com>,
Matthew Wilcox <willy@...radead.org>,
Shakeel Butt <shakeelb@...gle.com>,
Yang Shi <shy828301@...il.com>,
Jason Gunthorpe <jgg@...dia.com>,
Mike Kravetz <mike.kravetz@...cle.com>,
William Kucharski <william.kucharski@...cle.com>,
Andrea Arcangeli <aarcange@...hat.com>,
John Hubbard <jhubbard@...dia.com>,
David Nellans <dnellans@...dia.com>,
linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v2 00/30] 1GB PUD THP support on x86_64
>>> considering that 2MB THP have turned out to be quite a pain but
>>> situation has settled over time. Maybe our current code base is prepared
>>> for that much better.
>
> I am planning to refactor my code further to reduce the amount of
> the added code, since PUD THP is very similar to PMD THP. One thing
> I want to achieve is to enable split_huge_page to split any order of
> pages to a group of any lower order of pages. A lot of code in this
> patchset is replicating the same behavior of PMD THP at PUD level.
> It might be possible to deduplicate most of the code.
>
>>>
>>> Exposing that interface to the userspace is a different story of course.
>>> I do agree that we likely do not want to be very explicit about that.
>>> E.g. an interface for address space defragmentation without any more
>>> specifics sounds like a useful feature to me. It will be up to the
>>> kernel to decide which huge pages to use.
>>
>> Yes, I think one important feature would be that we don't end up placing
>> a gigantic page where only a handful of pages are actually populated
>> without green light from the application - because that's what some user
>> space applications care about (not consuming more memory than intended.
>> IIUC, this is also what this patch set does). I'm fine with placing
>> gigantic pages if it really just "defragments" the address space layout,
>> without filling unpopulated holes.
>>
>> Then, this would be mostly invisible to user space, and we really
>> wouldn't have to care about any configuration.
>
>
> I agree that the interface should be as simple as no configuration to
> most users. But I also wonder why we have hugetlbfs to allow users to
> specify different kinds of page sizes, which seems against the discussion
> above. Are we assuming advanced users should always use hugetlbfs instead
> of THPs?
Well, with hugetlbfs you get real control over which page sizes to use.
No mixture, guarantees.
In some environments you might want to control which application gets
which pagesize. I know of database applications and hypervisors that
sometimes really want 2MB huge pages instead of 1GB huge pages. And
sometimes you really want/need 1GB huge pages (e.g., low-latency
applications, real-time KVM, ...).
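To illustrate what "control over the page size" looks like from user space, here is a minimal sketch of how the size is encoded in the mmap() flags. The helper name and the SKETCH_ constant are made up for this example; the constant mirrors MAP_HUGE_SHIFT (26) from <linux/mman.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Same value as MAP_HUGE_SHIFT in <linux/mman.h>; redefined here (as an
 * assumption of this sketch) so it builds without kernel headers. */
#define SKETCH_MAP_HUGE_SHIFT 26

/* Hypothetical helper: encode a power-of-two huge page size into the
 * mmap() flag bits that select it, i.e. log2(size) << MAP_HUGE_SHIFT. */
static unsigned int huge_size_flag(size_t page_size)
{
    unsigned int log2_size = 0;

    /* Only non-zero powers of two are valid huge page sizes. */
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    while ((page_size >> log2_size) != 1)
        log2_size++;
    return log2_size << SKETCH_MAP_HUGE_SHIFT;
}
```

An application would OR the result into MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB. For 2MB the value equals MAP_HUGE_2MB, for 1GB it equals MAP_HUGE_1GB, and the mmap() call fails instead of silently falling back to another size when no pages of that size are available.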
Simple example: KVM with postcopy live migration
While 2MB huge pages work reasonably well, migrating 1GB gigantic pages
on demand (via userfaultfd) is painfully slow / impractical.
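A rough back-of-the-envelope sketch of why: during postcopy, a fault on a missing page stalls the vCPU until the whole page has arrived. The helper below is hypothetical and only accounts for raw transfer time; the 10 Gbit/s link speed used in the note after it is an assumption, not a number from this thread.

```c
#include <stdint.h>

/* Hypothetical helper: microseconds needed to move one page of
 * page_bytes over a link of link_bits_per_sec, ignoring latency,
 * protocol overhead and competing traffic. */
static uint64_t page_transfer_usec(uint64_t page_bytes,
                                   uint64_t link_bits_per_sec)
{
    return page_bytes * 8 * 1000000 / link_bits_per_sec;
}
```

With an assumed 10 Gbit/s link, a 2MB page takes roughly 1.7 ms to arrive, while a single 1GB page takes roughly 0.86 s, during which the faulting vCPU makes no progress.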
--
Thanks,
David / dhildenb