Message-ID: <7d3457cd-5e3d-42a7-8113-545da646d7c8@redhat.com>
Date: Wed, 6 Dec 2023 11:22:59 +0100
From: David Hildenbrand <david@...hat.com>
To: Ryan Roberts <ryan.roberts@....com>,
Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>,
Yin Fengwei <fengwei.yin@...el.com>,
Yu Zhao <yuzhao@...gle.com>,
Catalin Marinas <catalin.marinas@....com>,
Anshuman Khandual <anshuman.khandual@....com>,
Yang Shi <shy828301@...il.com>,
"Huang, Ying" <ying.huang@...el.com>, Zi Yan <ziy@...dia.com>,
Luis Chamberlain <mcgrof@...nel.org>,
Itaru Kitayama <itaru.kitayama@...il.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
John Hubbard <jhubbard@...dia.com>,
David Rientjes <rientjes@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>,
Hugh Dickins <hughd@...gle.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
Barry Song <21cnbao@...il.com>,
Alistair Popple <apopple@...dia.com>
Cc: linux-mm@...ck.org, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v8 00/10] Multi-size THP for anonymous memory
On 06.12.23 11:13, Ryan Roberts wrote:
> On 05/12/2023 17:21, David Hildenbrand wrote:
>> On 04.12.23 11:20, Ryan Roberts wrote:
>>> Hi All,
>>>
>>> A new week, a new version, a new name... This is v8 of a series to implement
>>> multi-size THP (mTHP) for anonymous memory (previously called "small-sized THP"
>>> and "large anonymous folios"). Matthew objected to "small huge" so hopefully
>>> this fares better.
>>>
>>> The objective of this is to improve performance by allocating larger chunks of
>>> memory during anonymous page faults:
>>>
>>> 1) Since SW (the kernel) is dealing with larger chunks of memory than base
>>> pages, there are efficiency savings to be had; fewer page faults, batched PTE
>>> and RMAP manipulation, reduced lru list, etc. In short, we reduce kernel
>>> overhead. This should benefit all architectures.
>>> 2) Since we are now mapping physically contiguous chunks of memory, we can take
>>> advantage of HW TLB compression techniques. A reduction in TLB pressure
>>> speeds up kernel and user space. arm64 systems have 2 mechanisms to coalesce
>>> TLB entries; "the contiguous bit" (architectural) and HPA (uarch).
>>>
>>> This version changes the name and tidies up some of the kernel code and test
>>> code, based on feedback against v7 (see change log for details).
>>>
>>> By default, the existing behaviour (and performance) is maintained. The user
>>> must explicitly enable multi-size THP to see the performance benefit. This is
>>> done via a new sysfs interface (as recommended by David Hildenbrand - thanks to
>>> David for the suggestion)! This interface is inspired by the existing
>>> per-hugepage-size sysfs interface used by hugetlb, provides full backwards
>>> compatibility with the existing PMD-size THP interface, and provides a base for
>>> future extensibility. See [8] for detailed discussion of the interface.
>>>
>>> This series is based on mm-unstable (715b67adf4c8).
>>
>> I took a look at the core pieces. Some things might want some smaller tweaks,
>> but nothing that should stop this from having fun in mm-unstable, and replacing
>> the smaller things as we move forward.
>>
>
> Thanks! I'll address your comments and see if I can post another (final??)
> version next week.
It's always possible to make incremental changes on top that Andrew will
squash in the end. I even recall that he prefers it that way once a series
has been in mm-unstable for a bit, so one can better see the diffs
and what effects they have.
--
Cheers,
David / dhildenb