[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f65baa0c-3f5a-45ad-80dd-c240fe166877@arm.com>
Date: Wed, 6 Dec 2023 10:13:21 +0000
From: Ryan Roberts <ryan.roberts@....com>
To: David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>,
Yin Fengwei <fengwei.yin@...el.com>,
Yu Zhao <yuzhao@...gle.com>,
Catalin Marinas <catalin.marinas@....com>,
Anshuman Khandual <anshuman.khandual@....com>,
Yang Shi <shy828301@...il.com>,
"Huang, Ying" <ying.huang@...el.com>, Zi Yan <ziy@...dia.com>,
Luis Chamberlain <mcgrof@...nel.org>,
Itaru Kitayama <itaru.kitayama@...il.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
John Hubbard <jhubbard@...dia.com>,
David Rientjes <rientjes@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>,
Hugh Dickins <hughd@...gle.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
Barry Song <21cnbao@...il.com>,
Alistair Popple <apopple@...dia.com>
Cc: linux-mm@...ck.org, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v8 00/10] Multi-size THP for anonymous memory
On 05/12/2023 17:21, David Hildenbrand wrote:
> On 04.12.23 11:20, Ryan Roberts wrote:
>> Hi All,
>>
>> A new week, a new version, a new name... This is v8 of a series to implement
>> multi-size THP (mTHP) for anonymous memory (previously called "small-sized THP"
>> and "large anonymous folios"). Matthew objected to "small huge" so hopefully
>> this fares better.
>>
>> The objective of this is to improve performance by allocating larger chunks of
>> memory during anonymous page faults:
>>
>> 1) Since SW (the kernel) is dealing with larger chunks of memory than base
>> pages, there are efficiency savings to be had; fewer page faults, batched PTE
>> and RMAP manipulation, reduced lru list, etc. In short, we reduce kernel
>> overhead. This should benefit all architectures.
>> 2) Since we are now mapping physically contiguous chunks of memory, we can take
>> advantage of HW TLB compression techniques. A reduction in TLB pressure
>> speeds up kernel and user space. arm64 systems have 2 mechanisms to coalesce
>> TLB entries; "the contiguous bit" (architectural) and HPA (uarch).
>>
>> This version changes the name and tidies up some of the kernel code and test
>> code, based on feedback against v7 (see change log for details).
>>
>> By default, the existing behaviour (and performance) is maintained. The user
>> must explicitly enable multi-size THP to see the performance benefit. This is
>> done via a new sysfs interface (as recommended by David Hildenbrand - thanks to
>> David for the suggestion)! This interface is inspired by the existing
>> per-hugepage-size sysfs interface used by hugetlb, provides full backwards
>> compatibility with the existing PMD-size THP interface, and provides a base for
>> future extensibility. See [8] for detailed discussion of the interface.
>>
>> This series is based on mm-unstable (715b67adf4c8).
>
> I took a look at the core pieces. Some things might want some smaller tweaks,
> but nothing that should stop this from having fun in mm-unstable, and replacing
> the smaller things as we move forward.
>
Thanks! I'll address your comments and see if I can post another (final??)
version next week.
Powered by blists - more mailing lists