Message-ID: <4e1a1878-4133-4d78-90fa-1d5bc99d179c@arm.com>
Date: Thu, 27 Jun 2024 10:27:14 +0100
From: Ryan Roberts <ryan.roberts@....com>
To: Barry Song <21cnbao@...il.com>
Cc: Zi Yan <ziy@...dia.com>, ran xiaokai <ranxiaokai627@....com>,
akpm@...ux-foundation.org, willy@...radead.org, vbabka@...e.cz,
svetly.todorov@...verge.com, ran.xiaokai@....com.cn, peterx@...hat.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
linux-fsdevel@...r.kernel.org, David Hildenbrand <david@...hat.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
Kefeng Wang <wangkefeng.wang@...wei.com>, Lance Yang <ioworker0@...il.com>
Subject: Re: [PATCH 2/2] kpageflags: fix wrong KPF_THP on non-pmd-mappable
compound pages
On 27/06/2024 10:16, Barry Song wrote:
> On Thu, Jun 27, 2024 at 8:39 PM Ryan Roberts <ryan.roberts@....com> wrote:
>>
>> On 27/06/2024 05:10, Barry Song wrote:
>>> On Thu, Jun 27, 2024 at 2:40 AM Zi Yan <ziy@...dia.com> wrote:
>>>>
>>>> On Wed Jun 26, 2024 at 7:07 AM EDT, Ryan Roberts wrote:
>>>>> On 26/06/2024 04:06, Zi Yan wrote:
>>>>>> On Tue Jun 25, 2024 at 10:49 PM EDT, ran xiaokai wrote:
>>>>>>> From: Ran Xiaokai <ran.xiaokai@....com.cn>
>>>>>>>
>>>>>>> KPF_COMPOUND_HEAD and KPF_COMPOUND_TAIL are set on "common" compound
>>>>>>> pages, which means of any order, but KPF_THP should only be set
>>>>>>> when the folio is a 2M pmd mappable THP.
>>>>>
>>>>> Why should KPF_THP only be set on 2M THP? What problem does it cause as it is
>>>>> currently configured?
>>>>>
>>>>> I would argue that mTHP is still THP so should still have the flag. And since
>>>>> these smaller mTHP sizes are disabled by default, only mTHP-aware user space
>>>>> will be enabling them, so I'll naively state that it should not cause compat
>>>>> issues as is.
>>>>>
>>>>> Also, the script at tools/mm/thpmaps relies on KPF_THP being set for all mTHP
>>>>> sizes to function correctly. So that would need to be reworked if making this
>>>>> change.
>>>>
>>>> + more folks working on mTHP
>>>>
>>>> I agree that mTHP is still THP, but we might want different
>>>> stats/counters for it, since people might want to keep the old THP counters
>>>> consistent. See recent commits on adding mTHP counters:
>>>> ec33687c6749 ("mm: add per-order mTHP anon_fault_alloc and anon_fault_fallback
>>>> counters"), 1f97fd042f38 ("mm: shmem: add mTHP counters for anonymous shmem")
>>>>
>>>> and changes to make THP counter to only count PMD THP:
>>>> 835c3a25aa37 ("mm: huge_memory: add the missing folio_test_pmd_mappable() for
>>>> THP split statistics")
>>>>
>>>> In this case, I wonder if we want a new KPF_MTHP bit for mTHP and some
>>>> adjustment on tools/mm/thpmaps.
>>>
>>> It seems we have to do this, though I think keeping KPF_THP and adding a
>>> separate bit like KPF_PMD_MAPPED makes more sense. But those tools
>>> relying on KPF_THP need to realize this and check the new bit, which is
>>> not done now.
>>> Whether mTHP is called mTHP or THP will make no difference in
>>> this case :-)
>>
>> I don't quite follow your logic for that last part; If there are 2 separate
>> bits; KPF_THP and KPF_MTHP, and KPF_THP is only set for PMD-sized THP, that
>> would be a safe/compatible approach, right? Whereas your suggestion requires
>> changes to existing tools to work.
>
> Right, my point is that mTHP and THP are both types of THP. The only difference
> is whether they are PMD-mapped or PTE-mapped. Adding a bit to describe how
> the page is mapped would more accurately reflect reality. However, this change
> would disrupt tools that assume KPF_THP always means PMD-mapped THP.
> Therefore, we would still need separate bits for THP and mTHP in this case.

I think perhaps PTE- vs PMD-mapped is a separate issue. The issue at hand is
whether KPF_THP implies a fixed size (and alignment). If compat is an issue,
then KPF_THP must continue to imply PMD-size. If compat is not an issue, then
size can be determined by iterating over the entries.

Having a mechanism to determine the level at which a block is mapped would
potentially be a useful feature, but seems orthogonal to me.

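To illustrate what "iterating over the entries" could look like, here is a
minimal sketch. It assumes the flag words have already been read from
/proc/kpageflags (one native-endian 64-bit word per PFN, normally readable
only by root) and that KPF_THP is set on the head and on every tail page of
a THP of any size, as it is today. The bit numbers are the ones in
include/uapi/linux/kernel-page-flags.h; the helper names decode_kpageflags()
and thp_runs() are made up for this example.

```python
import struct

# Bit positions from include/uapi/linux/kernel-page-flags.h
KPF_COMPOUND_HEAD = 1 << 15
KPF_COMPOUND_TAIL = 1 << 16
KPF_THP = 1 << 22


def decode_kpageflags(raw):
    """Unpack raw /proc/kpageflags bytes into a list of 64-bit flag words."""
    count = len(raw) // 8
    return list(struct.unpack(f"={count}Q", raw[:count * 8]))


def thp_runs(flags):
    """Return (start_index, nr_pages) for each THP found in flags.

    The size is not assumed; it is measured by counting the head entry
    plus the tail entries that immediately follow it.
    """
    runs = []
    i = 0
    while i < len(flags):
        is_head = flags[i] & KPF_THP and flags[i] & KPF_COMPOUND_HEAD
        if not is_head:
            i += 1
            continue
        n = 1
        while (i + n < len(flags)
               and flags[i + n] & KPF_THP
               and flags[i + n] & KPF_COMPOUND_TAIL):
            n += 1
        runs.append((i, n))
        i += n
    return runs
```

A caller would pass a window of consecutive PFNs; a THP that straddles the
end of the window is reported with a truncated length, so real code would
need to handle that boundary.
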
>
> I saw Willy complain about mTHP being called "mTHP," but in this case, calling
> it "mTHP" or just "THP" doesn't change anything if old tools continue to assume
> that KPF_THP means PMD-mapped THP.
I think Willy was just ribbing me because he preferred calling it "anonymous
large folios". That's how I took it anyway.
>
>>
>> Thinking about this a bit more, I wonder if KPF_MTHP is the right name for a new
>> flag; We don't currently expose the term "mTHP" to user space. I can't think of
>> a better name though.
>
> Yes. If "compatibility" is a requirement, we cannot disregard it.
>
>> I'd still like to understand what is actually broken that this change is fixing.
>> Is the concern that a user could see KPF_THP and advance forward by
>> "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size / getpagesize()" entries?
>>
>
> Maybe we need an example of a tool that assumes KPF_THP means PMD-mapped.
Yes, that would help.
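
As a purely hypothetical illustration (I am not aware of a real tool that
does this), a consumer that hard-codes the stride might look like the sketch
below. pages_per_pmd stands for hpage_pmd_size / getpagesize(), and
count_thp_fixed_stride() is an invented name. With mTHP present, the walker
jumps over neighbouring folios and under-counts.

```python
# Bit positions from include/uapi/linux/kernel-page-flags.h
KPF_COMPOUND_HEAD = 1 << 15
KPF_COMPOUND_TAIL = 1 << 16
KPF_THP = 1 << 22


def count_thp_fixed_stride(flags, pages_per_pmd):
    """Count THPs while assuming every KPF_THP head spans pages_per_pmd entries.

    This is the compat-sensitive pattern: the assumption only holds when
    KPF_THP implies a PMD-sized folio.
    """
    count = 0
    i = 0
    while i < len(flags):
        if flags[i] & KPF_THP and flags[i] & KPF_COMPOUND_HEAD:
            count += 1
            i += pages_per_pmd  # blindly skip a PMD's worth of entries
        else:
            i += 1
    return count


# Two adjacent 4-page mTHPs, walked with a (deliberately small) stride of 8:
head = KPF_COMPOUND_HEAD | KPF_THP
tail = KPF_COMPOUND_TAIL | KPF_THP
two_mthp = [head, tail, tail, tail, head, tail, tail, tail]
miscounted = count_thp_fixed_stride(two_mthp, 8)  # second folio is skipped
```
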
>
>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Yan, Zi
>>>>
>>>
>
> Thanks
> Barry