[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2450e4f8-236f-49ce-8bd3-b30a6d8c5e57@arm.com>
Date: Tue, 2 Jul 2024 09:24:31 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: Yang Shi <shy828301@...il.com>, David Hildenbrand <david@...hat.com>
Cc: Baolin Wang <baolin.wang@...ux.alibaba.com>,
Bang Li <libang.li@...group.com>, hughd@...gle.com,
akpm@...ux-foundation.org, wangkefeng.wang@...wei.com, ziy@...dia.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH] support "THPeligible" semantics for mTHP with anonymous
shmem
On 01/07/2024 19:20, Yang Shi wrote:
> On Mon, Jul 1, 2024 at 3:23 AM David Hildenbrand <david@...hat.com> wrote:
>>
>> On 01.07.24 12:16, Ryan Roberts wrote:
>>> On 01/07/2024 10:17, David Hildenbrand wrote:
>>>> On 01.07.24 11:14, Ryan Roberts wrote:
>>>>> On 01/07/2024 09:57, David Hildenbrand wrote:
>>>>>> On 01.07.24 10:50, Ryan Roberts wrote:
>>>>>>> On 01/07/2024 09:48, David Hildenbrand wrote:
>>>>>>>> On 01.07.24 10:40, Ryan Roberts wrote:
>>>>>>>>> On 01/07/2024 09:33, Baolin Wang wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 2024/7/1 15:55, Ryan Roberts wrote:
>>>>>>>>>>> On 28/06/2024 11:49, Bang Li wrote:
>>>>>>>>>>>> After the commit 7fb1b252afb5 ("mm: shmem: add mTHP support for
>>>>>>>>>>>> anonymous shmem"), we can configure different policies through
>>>>>>>>>>>> the multi-size THP sysfs interface for anonymous shmem. But
>>>>>>>>>>>> currently "THPeligible" indicates only whether the mapping is
>>>>>>>>>>>> eligible for allocating THP-pages as well as the THP is PMD
>>>>>>>>>>>> mappable or not for anonymous shmem, we need to support semantics
>>>>>>>>>>>> for mTHP with anonymous shmem similar to those for mTHP with
>>>>>>>>>>>> anonymous memory.
>>>>>>>>>>>>
>>>>>>>>>>>> Signed-off-by: Bang Li <libang.li@...group.com>
>>>>>>>>>>>> ---
>>>>>>>>>>>> fs/proc/task_mmu.c | 10 +++++++---
>>>>>>>>>>>> include/linux/huge_mm.h | 11 +++++++++++
>>>>>>>>>>>> mm/shmem.c | 9 +--------
>>>>>>>>>>>> 3 files changed, 19 insertions(+), 11 deletions(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>>>>>>>>>>>> index 93fb2c61b154..09b5db356886 100644
>>>>>>>>>>>> --- a/fs/proc/task_mmu.c
>>>>>>>>>>>> +++ b/fs/proc/task_mmu.c
>>>>>>>>>>>> @@ -870,6 +870,7 @@ static int show_smap(struct seq_file *m, void *v)
>>>>>>>>>>>> {
>>>>>>>>>>>> struct vm_area_struct *vma = v;
>>>>>>>>>>>> struct mem_size_stats mss = {};
>>>>>>>>>>>> + bool thp_eligible;
>>>>>>>>>>>> smap_gather_stats(vma, &mss, 0);
>>>>>>>>>>>> @@ -882,9 +883,12 @@ static int show_smap(struct seq_file *m, void
>>>>>>>>>>>> *v)
>>>>>>>>>>>> __show_smap(m, &mss, false);
>>>>>>>>>>>> - seq_printf(m, "THPeligible: %8u\n",
>>>>>>>>>>>> - !!thp_vma_allowable_orders(vma, vma->vm_flags,
>>>>>>>>>>>> - TVA_SMAPS | TVA_ENFORCE_SYSFS, THP_ORDERS_ALL));
>>>>>>>>>>>> + thp_eligible = !!thp_vma_allowable_orders(vma, vma->vm_flags,
>>>>>>>>>>>> + TVA_SMAPS | TVA_ENFORCE_SYSFS, THP_ORDERS_ALL);
>>>>>>>>>>>> + if (vma_is_anon_shmem(vma))
>>>>>>>>>>>> + thp_eligible =
>>>>>>>>>>>> !!shmem_allowable_huge_orders(file_inode(vma->vm_file),
>>>>>>>>>>>> + vma, vma->vm_pgoff, thp_eligible);
>>>>>>>>>>>
>>>>>>>>>>> Afraid I haven't been following the shmem mTHP support work as much as I
>>>>>>>>>>> would
>>>>>>>>>>> have liked, but is there a reason why we need a separate function for
>>>>>>>>>>> shmem?
>>>>>>>>>>
>>>>>>>>>> Since shmem_allowable_huge_orders() only uses shmem specific logic to
>>>>>>>>>> determine
>>>>>>>>>> if huge orders are allowable, there is no need to complicate the
>>>>>>>>>> thp_vma_allowable_orders() function by adding more shmem related logic,
>>>>>>>>>> making
>>>>>>>>>> it more bloated. In my view, providing a dedicated helper
>>>>>>>>>> shmem_allowable_huge_orders(), specifically for shmem, simplifies the logic.
>>>>>>>>>
>>>>>>>>> My point was really that a single interface (thp_vma_allowable_orders)
>>>>>>>>> should be
>>>>>>>>> used to get this information. I have no strong opinon on how the
>>>>>>>>> implementation
>>>>>>>>> of that interface looks. What you suggest below seems perfectly reasonable
>>>>>>>>> to me.
>>>>>>>>
>>>>>>>> Right. thp_vma_allowable_orders() might require some care as discussed in
>>>>>>>> other
>>>>>>>> context (cleanly separate dax and shmem handling/orders). But that would be
>>>>>>>> follow-up cleanups.
>>>>>>>
>>>>>>> Are you planning to do that, or do you want me to send a patch?
>>>>>>
>>>>>> I'm planning on looking into some details, especially the interaction with large
>>>>>> folios in the pagecache. I'll let you know once I have a better idea what
>>>>>> actually should be done :)
>>>>>
>>>>> OK great - I'll scrub it from my todo list... really getting things done today :)
>>>>
>>>> Resolved the khugepaged thiny already? :P
>>>>
>>>> [khugepaged not active when only enabling the sub-size via the 2M folder IIRC]
>>>
>>> Hmm... baby brain?
>>
>> :)
>>
>> I think I only mentioned it in a private mail at some point.
>>
>>>
>>> Sorry about that. I've been a bit useless lately. For some reason it wasn't on
>>> my list, but its there now. Will prioritise it, because I agree it's not good.
>>
>>
>> IIRC, if you do
>>
>> echo never > /sys/kernel/mm/transparent_hugepage/enabled
>> echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
>>
>> khugepaged will not get activated.
>
> khugepaged is controlled by the top level knob.
What do you mean by "top level knob"? I assume
/sys/kernel/mm/transparent_hugepage/enabled ?
If so, that's not really a thing in its own right; its just the legacy PMD-size
THP control, and we only take any notice of it if a per-size control is set to
"inherit". So if we have:
# echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
Then by design, /sys/kernel/mm/transparent_hugepage/enabled should be ignored.
> But the above setting
> sounds confusing, can we disable the top level knob, but enable it on
> a per-order basis? TBH, it sounds weird and doesn't make too much
> sense to me.
Well that's the design and that's how its documented. It's done this way for
back-compat. All controls are now per-size. But at boot, we default all per-size
controls to "never" except for the PMD-sized control, which is defaulted to
"inherit". That way, an unenlightened user-space can still control PMD-sized THP
via the legacy (top-level) control. But enlightened apps can directly control
per-size.
I'm not sure how your way would work, because you would have 2 controls
competing to do the same thing?
>
>>
>> --
>> Cheers,
>>
>> David / dhildenb
>>
>>
Powered by blists - more mailing lists