[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <1ae94fa5-8d68-4e4f-83e2-dfeefaab253a@arm.com>
Date: Sat, 7 Jun 2025 20:12:23 +0530
From: Dev Jain <dev.jain@....com>
To: Nico Pache <npache@...hat.com>
Cc: linux-mm@...ck.org, linux-doc@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
david@...hat.com, ziy@...dia.com, baolin.wang@...ux.alibaba.com,
lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, ryan.roberts@....com,
corbet@....net, rostedt@...dmis.org, mhiramat@...nel.org,
mathieu.desnoyers@...icios.com, akpm@...ux-foundation.org,
baohua@...nel.org, willy@...radead.org, peterx@...hat.com,
wangkefeng.wang@...wei.com, usamaarif642@...il.com, sunnanyong@...wei.com,
vishal.moola@...il.com, thomas.hellstrom@...ux.intel.com,
yang@...amperecomputing.com, kirill.shutemov@...ux.intel.com,
aarcange@...hat.com, raquini@...hat.com, anshuman.khandual@....com,
catalin.marinas@....com, tiwai@...e.de, will@...nel.org,
dave.hansen@...ux.intel.com, jack@...e.cz, cl@...two.org,
jglisse@...gle.com, surenb@...gle.com, zokeefe@...gle.com,
hannes@...xchg.org, rientjes@...gle.com, mhocko@...e.com,
rdunlap@...radead.org
Subject: Re: [PATCH v7 07/12] khugepaged: add mTHP support
On 07/06/25 8:01 pm, Dev Jain wrote:
>
> On 07/06/25 6:33 pm, Nico Pache wrote:
>> On Sat, Jun 7, 2025 at 12:24 AM Dev Jain <dev.jain@....com> wrote:
>>>
>>> On 15/05/25 8:52 am, Nico Pache wrote:
>>>> Introduce the ability for khugepaged to collapse to different mTHP
>>>> sizes.
>>>> While scanning PMD ranges for potential collapse candidates, keep
>>>> track
>>>> of pages in KHUGEPAGED_MIN_MTHP_ORDER chunks via a bitmap. Each bit
>>>> represents a utilized region of order KHUGEPAGED_MIN_MTHP_ORDER
>>>> ptes. If
>>>> mTHPs are enabled we remove the restriction of max_ptes_none during
>>>> the
>>>> scan phase so we dont bailout early and miss potential mTHP
>>>> candidates.
>>>>
>>>> After the scan is complete we will perform binary recursion on the
>>>> bitmap to determine which mTHP size would be most efficient to
>>>> collapse
>>>> to. max_ptes_none will be scaled by the attempted collapse order to
>>>> determine how full a THP must be to be eligible.
>>>>
>>>> If a mTHP collapse is attempted, but contains swapped out, or shared
>>>> pages, we dont perform the collapse.
>>>>
>>>> For non PMD collapse we much leave the anon VMA write locked until
>>>> after
>>>> we collapse the mTHP
>>> Why? I know that Hugh pointed out locking errors; I am yet to catch up
>>> on that thread, but you need to explain in the description why you do
>>> what you do.
>>>
>>> [--snip---]
>>>
>>>> -
>>>> - spin_lock(pmd_ptl);
>>>> - BUG_ON(!pmd_none(*pmd));
>>>> - folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
>>>> - folio_add_lru_vma(folio, vma);
>>>> - pgtable_trans_huge_deposit(mm, pmd, pgtable);
>>>> - set_pmd_at(mm, address, pmd, _pmd);
>>>> - update_mmu_cache_pmd(vma, address, pmd);
>>>> - deferred_split_folio(folio, false);
>>>> - spin_unlock(pmd_ptl);
>>>> + if (order == HPAGE_PMD_ORDER) {
>>>> + pgtable = pmd_pgtable(_pmd);
>>>> + _pmd = folio_mk_pmd(folio, vma->vm_page_prot);
>>>> + _pmd = maybe_pmd_mkwrite(pmd_mkdirty(_pmd), vma);
>>>> +
>>>> + spin_lock(pmd_ptl);
>>>> + BUG_ON(!pmd_none(*pmd));
>>>> + folio_add_new_anon_rmap(folio, vma, _address,
>>>> RMAP_EXCLUSIVE);
>>>> + folio_add_lru_vma(folio, vma);
>>>> + pgtable_trans_huge_deposit(mm, pmd, pgtable);
>>>> + set_pmd_at(mm, address, pmd, _pmd);
>>>> + update_mmu_cache_pmd(vma, address, pmd);
>>>> + deferred_split_folio(folio, false);
>>>> + spin_unlock(pmd_ptl);
>>>> + } else { /* mTHP collapse */
>>>> + mthp_pte = mk_pte(&folio->page, vma->vm_page_prot);
>>>> + mthp_pte = maybe_mkwrite(pte_mkdirty(mthp_pte), vma);
>>>> +
>>>> + spin_lock(pmd_ptl);
>>> Nico,
>>>
>>> I've noticed a few occasions where my review comments have not been
>>> acknowledged -
>>> for example, [1]. It makes it difficult to follow up and contributes
>>> to some
>>> frustration on my end. I'd appreciate if you could make sure to
>>> respond to
>>> feedback, even if you are disagreeing with my comments. Thanks!
>> I'm sorry you feel that way, are there any others? I feel like I've
>> been pretty good at responding to all comments. I've also been out of
>> the office for the last month, so keeping up with upstream has been
>> more difficult, but i'm back now.
>
> No issues, there were others but I don't want to waste our time digging
> them up, when we are on the same page!
To be clear, those others were when we were debating about your and my
method,
that is why I said it is a waste of time revisiting them :)
>
>>
>> Sorry I never got back to you on that one! I will add the BUG_ON, but
>> I believe it's unnecessary. Your changeset was focused on different
>> functionality and it seems that you had a bug in it if you were
>> hitting that often.
>
> In my original reply, when I said "I hit the BUG_ON a lot of times",
> I meant, during testing. It was quite difficult to extend for non-PMD
> sized VMAs, and the BUG_ON was getting hit due to rmap reaching the
> non-isolated folios and somehow installing the PMD again. That is
> why I say that the BUG_ON is important since it will help us catch
> bugs early. And we have that for the PMD case anyways so why not
> for mTHP...
>
>>
>> Cheers,
>> -- Nico
>>>
>>> [1]
>>> https://lore.kernel.org/all/08d13445-5ed1-42ea-8aee-c1dbde24407e@arm.com/
>>>
>>>
>>> [---snip---]
>>>
>>
>
Powered by blists - more mailing lists