[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAGsJ_4zU6pHOwEzCQxo9UWUFmHZp0pvfgF4pKbkFfDPCRyFwyw@mail.gmail.com>
Date: Tue, 5 Dec 2023 17:21:09 +1300
From: Barry Song <21cnbao@...il.com>
To: Ryan Roberts <ryan.roberts@....com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Matthew Wilcox <willy@...radead.org>,
Yin Fengwei <fengwei.yin@...el.com>,
David Hildenbrand <david@...hat.com>,
Yu Zhao <yuzhao@...gle.com>,
Catalin Marinas <catalin.marinas@....com>,
Anshuman Khandual <anshuman.khandual@....com>,
Yang Shi <shy828301@...il.com>,
"Huang, Ying" <ying.huang@...el.com>, Zi Yan <ziy@...dia.com>,
Luis Chamberlain <mcgrof@...nel.org>,
Itaru Kitayama <itaru.kitayama@...il.com>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
John Hubbard <jhubbard@...dia.com>,
David Rientjes <rientjes@...gle.com>,
Vlastimil Babka <vbabka@...e.cz>,
Hugh Dickins <hughd@...gle.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>,
Alistair Popple <apopple@...dia.com>, linux-mm@...ck.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v8 03/10] mm: thp: Introduce multi-size THP sysfs interface
On Mon, Dec 4, 2023 at 11:21 PM Ryan Roberts <ryan.roberts@....com> wrote:
>
> In preparation for adding support for anonymous multi-size THP,
> introduce new sysfs structure that will be used to control the new
> behaviours. A new directory is added under transparent_hugepage for each
> supported THP size, and contains an `enabled` file, which can be set to
> "inherit" (to inherit the global setting), "always", "madvise" or
> "never". For now, the kernel still only supports PMD-sized anonymous
> THP, so only 1 directory is populated.
>
> The first half of the change converts transhuge_vma_suitable() and
> hugepage_vma_check() so that they take a bitfield of orders for which
> the user wants to determine support, and the functions filter out all
> the orders that can't be supported, given the current sysfs
> configuration and the VMA dimensions. If there is only 1 order set in
> the input then the output can continue to be treated like a boolean;
> this is the case for most call sites. The resulting functions are
> renamed to thp_vma_suitable_orders() and thp_vma_allowable_orders()
> respectively.
>
> The second half of the change implements the new sysfs interface. It has
> been done so that each supported THP size has a `struct thpsize`, which
> describes the relevant metadata and is itself a kobject. This is pretty
> minimal for now, but should make it easy to add new per-thpsize files to
> the interface if needed in future (e.g. per-size defrag). Rather than
> keep the `enabled` state directly in the struct thpsize, I've elected to
> directly encode it into huge_anon_orders_[always|madvise|inherit]
> bitfields since this reduces the amount of work required in
> thp_vma_allowable_orders() which is called for every page fault.
>
> See Documentation/admin-guide/mm/transhuge.rst, as modified by this
> commit, for details of how the new sysfs interface works.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@....com>
Reviewed-by: Barry Song <v-songbaohua@...o.com>
> -khugepaged will be automatically started when
> -transparent_hugepage/enabled is set to "always" or "madvise, and it'll
> -be automatically shutdown if it's set to "never".
> +khugepaged will be automatically started when one or more hugepage
> +sizes are enabled (either by directly setting "always" or "madvise",
> +or by setting "inherit" while the top-level enabled is set to "always"
> +or "madvise"), and it'll be automatically shutdown when the last
> +hugepage size is disabled (either by directly setting "never", or by
> +setting "inherit" while the top-level enabled is set to "never").
>
> Khugepaged controls
> -------------------
>
> +.. note::
> + khugepaged currently only searches for opportunities to collapse to
> + PMD-sized THP and no attempt is made to collapse to other THP
> + sizes.
For small-size THP, collapse is probably a bad idea. we like a one-shot
try in Android especially we are using a 64KB and less large folio size. if
PF succeeds in getting large folios, we map large folios, otherwise we
give up as those memories can be quite unstably swapped-out, swapped-in
and madvised to be DONTNEED.
too many compactions will increase power consumption and decrease UI
response.
Thanks
Barry
Powered by blists - more mailing lists