[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <8df9636e-1816-46d2-9f5b-e3029848655d@arm.com>
Date: Fri, 5 Jul 2024 10:05:32 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: Baolin Wang <baolin.wang@...ux.alibaba.com>,
Bang Li <libang.linux@...il.com>, akpm@...ux-foundation.org, hughd@...gle.com
Cc: willy@...radead.org, david@...hat.com, wangkefeng.wang@...wei.com,
ying.huang@...el.com, 21cnbao@...il.com, shy828301@...il.com,
ziy@...dia.com, ioworker0@...il.com, da.gomez@...sung.com,
p.raghav@...sung.com, linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 4/6] mm: shmem: add mTHP support for anonymous shmem
On 05/07/2024 09:57, Baolin Wang wrote:
>
>
> On 2024/7/5 16:42, Ryan Roberts wrote:
>> On 05/07/2024 04:01, Baolin Wang wrote:
>>>
>>>
>>> On 2024/7/4 22:46, Bang Li wrote:
>>>> Hi Bao lin,
>>>>
>>>> On 2024/7/4 19:15, Baolin Wang wrote:
>>>>>
>>>>>>> +
>>>>>>> + /*
>>>>>>> + * Only allow inherit orders if the top-level value is 'force', which
>>>>>>> + * means non-PMD sized THP can not override 'huge' mount option now.
>>>>>>> + */
>>>>>>> + if (shmem_huge == SHMEM_HUGE_FORCE)
>>>>>>> + return READ_ONCE(huge_shmem_orders_inherit);
>>>>>>
>>>>>> I vaguely recall that we originally discussed that trying to set 'force'
>>>>>> on the
>>>>>> top level control while any per-size controls were set to 'inherit' would
>>>>>> be an
>>>>>> error, and trying to set 'force' on any per-size control except the PMD-size
>>>>>> would be an error?
>>>>>
>>>>> Right.
>>>>>
>>>>>> I don't really understand this logic. Shouldn't we just be looking at the
>>>>>> per-size control settings (or the top-level control as a proxy for every
>>>>>> per-size control that has 'inherit' set)?
>>>>>
>>>>> ‘force’ will apply the huge orders for anon shmem and tmpfs, so now we only
>>>>> allow pmd-mapped THP to be forced. We should not look at per-size control
>>>>> settings for tmpfs now (mTHP for tmpfs will be discussed in future).
>>>>>
>>>>>>
>>>>>> Then for tmpfs, which doesn't support non-PMD-sizes yet, we just always
>>>>>> use the
>>>>>> PMD-size control for decisions.
>>>>>>
>>>>>> I'm also really struggling with the concept of shmem_is_huge() existing along
>>>>>> side shmem_allowable_huge_orders(). Surely this needs to all be refactored
>>>>>> into
>>>>>> shmem_allowable_huge_orders()?
>>>>>
>>>>> I understood. But now they serve different purposes: shmem_is_huge() will be
>>>>> used to check the huge orders for the top level, for *tmpfs* and anon shmem;
>>>>> whereas shmem_allowable_huge_orders() will only be used to check the per-size
>>>>> huge orders for anon shmem (excluding tmpfs now). However, as I plan to add
>>>>> mTHP support for tmpfs, I think we can perform some cleanups.
>>>>
>>>> Please count me in, I'd be happy to contribute to the cleanup and enhancement
>>>> process if I can.
>>>
>>> Good. If you have time, I think you can look at the shmem khugepaged issue from
>>> the previous discussion [1], which I don't have time to look at now.
>>>
>>> "
>>> (3) khugepaged
>>>
>>> khugepaged needs to handle larger folios properly as well. Until fixed,
>>> using smaller THP sizes as fallback might prohibit collapsing a
>>> PMD-sized THP later. But really, khugepaged needs to be fixed to handle
>>> that.
>>> "
>>
>> khugepaged can already collapse "folios of any order less then PMD-size" to
>> PMD-size, for anon memory. Infact I modified the selftest to be able to test
>> that in commit 9f0704eae8a4 ("selftests/mm/khugepaged: enlighten for multi-size
>> THP"). I'd be surprised if khugepaged can't alreay handle the same for shmem?
>
> I did not test this, but from the comment in hpage_collapse_scan_file(), seems
> that compacting small mTHP into a single PMD-mapped THP is not supported yet.
>
> /*
> * TODO: khugepaged should compact smaller compound pages
> * into a PMD sized page
> */
> if (folio_test_large(folio)) {
> result = folio_order(folio) == HPAGE_PMD_ORDER &&
> folio->index == start
> /* Maybe PMD-mapped */
> ? SCAN_PTE_MAPPED_HUGEPAGE
> : SCAN_PAGE_COMPOUND;
> /*
> * For SCAN_PTE_MAPPED_HUGEPAGE, further processing
> * by the caller won't touch the page cache, and so
> * it's safe to skip LRU and refcount checks before
> * returning.
> */
> break;
> }
OK, so the functionality is missing just for shmem/file-backed collapse. Got it.
Sorry for the noise.
>
>> Although the test will definitely want to be extended to test it.
>
> Right.
Powered by blists - more mailing lists