lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2666b1a8-12c0-475e-9aad-2fdc3e846e9a@arm.com>
Date: Sat, 7 Jun 2025 20:01:45 +0530
From: Dev Jain <dev.jain@....com>
To: Nico Pache <npache@...hat.com>
Cc: linux-mm@...ck.org, linux-doc@...r.kernel.org,
 linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
 david@...hat.com, ziy@...dia.com, baolin.wang@...ux.alibaba.com,
 lorenzo.stoakes@...cle.com, Liam.Howlett@...cle.com, ryan.roberts@....com,
 corbet@....net, rostedt@...dmis.org, mhiramat@...nel.org,
 mathieu.desnoyers@...icios.com, akpm@...ux-foundation.org,
 baohua@...nel.org, willy@...radead.org, peterx@...hat.com,
 wangkefeng.wang@...wei.com, usamaarif642@...il.com, sunnanyong@...wei.com,
 vishal.moola@...il.com, thomas.hellstrom@...ux.intel.com,
 yang@...amperecomputing.com, kirill.shutemov@...ux.intel.com,
 aarcange@...hat.com, raquini@...hat.com, anshuman.khandual@....com,
 catalin.marinas@....com, tiwai@...e.de, will@...nel.org,
 dave.hansen@...ux.intel.com, jack@...e.cz, cl@...two.org,
 jglisse@...gle.com, surenb@...gle.com, zokeefe@...gle.com,
 hannes@...xchg.org, rientjes@...gle.com, mhocko@...e.com,
 rdunlap@...radead.org
Subject: Re: [PATCH v7 07/12] khugepaged: add mTHP support


On 07/06/25 6:33 pm, Nico Pache wrote:
> On Sat, Jun 7, 2025 at 12:24 AM Dev Jain <dev.jain@....com> wrote:
>>
>> On 15/05/25 8:52 am, Nico Pache wrote:
>>> Introduce the ability for khugepaged to collapse to different mTHP sizes.
>>> While scanning PMD ranges for potential collapse candidates, keep track
>>> of pages in KHUGEPAGED_MIN_MTHP_ORDER chunks via a bitmap. Each bit
>>> represents a utilized region of order KHUGEPAGED_MIN_MTHP_ORDER ptes. If
>>> mTHPs are enabled we remove the restriction of max_ptes_none during the
>>> scan phase so we dont bailout early and miss potential mTHP candidates.
>>>
>>> After the scan is complete we will perform binary recursion on the
>>> bitmap to determine which mTHP size would be most efficient to collapse
>>> to. max_ptes_none will be scaled by the attempted collapse order to
>>> determine how full a THP must be to be eligible.
>>>
>>> If a mTHP collapse is attempted, but contains swapped out, or shared
>>> pages, we dont perform the collapse.
>>>
>>> For non PMD collapse we much leave the anon VMA write locked until after
>>> we collapse the mTHP
>> Why? I know that Hugh pointed out locking errors; I am yet to catch up
>> on that thread, but you need to explain in the description why you do
>> what you do.
>>
>> [--snip---]
>>
>>> -
>>> -     spin_lock(pmd_ptl);
>>> -     BUG_ON(!pmd_none(*pmd));
>>> -     folio_add_new_anon_rmap(folio, vma, address, RMAP_EXCLUSIVE);
>>> -     folio_add_lru_vma(folio, vma);
>>> -     pgtable_trans_huge_deposit(mm, pmd, pgtable);
>>> -     set_pmd_at(mm, address, pmd, _pmd);
>>> -     update_mmu_cache_pmd(vma, address, pmd);
>>> -     deferred_split_folio(folio, false);
>>> -     spin_unlock(pmd_ptl);
>>> +     if (order == HPAGE_PMD_ORDER) {
>>> +             pgtable = pmd_pgtable(_pmd);
>>> +             _pmd = folio_mk_pmd(folio, vma->vm_page_prot);
>>> +             _pmd = maybe_pmd_mkwrite(pmd_mkdirty(_pmd), vma);
>>> +
>>> +             spin_lock(pmd_ptl);
>>> +             BUG_ON(!pmd_none(*pmd));
>>> +             folio_add_new_anon_rmap(folio, vma, _address, RMAP_EXCLUSIVE);
>>> +             folio_add_lru_vma(folio, vma);
>>> +             pgtable_trans_huge_deposit(mm, pmd, pgtable);
>>> +             set_pmd_at(mm, address, pmd, _pmd);
>>> +             update_mmu_cache_pmd(vma, address, pmd);
>>> +             deferred_split_folio(folio, false);
>>> +             spin_unlock(pmd_ptl);
>>> +     } else { /* mTHP collapse */
>>> +             mthp_pte = mk_pte(&folio->page, vma->vm_page_prot);
>>> +             mthp_pte = maybe_mkwrite(pte_mkdirty(mthp_pte), vma);
>>> +
>>> +             spin_lock(pmd_ptl);
>> Nico,
>>
>> I've noticed a few occasions where my review comments have not been acknowledged -
>> for example, [1]. It makes it difficult to follow up and contributes to some
>> frustration on my end. I'd appreciate if you could make sure to respond to
>> feedback, even if you are disagreeing with my comments. Thanks!
> I'm sorry you feel that way, are there any others? I feel like I've
> been pretty good at responding to all comments. I've also been out of
> the office for the last month, so keeping up with upstream has been
> more difficult, but i'm back now.

No issues, there were others but I don't want to waste our time digging
them up, when we are on the same page!

>
> Sorry I never got back to you on that one! I will add the BUG_ON, but
> I believe it's unnecessary. Your changeset was focused on different
> functionality and it seems that you had a bug in it if you were
> hitting that often.

In my original reply, when I said "I hit the BUG_ON a lot of times",
I meant, during testing. It was quite difficult to extend for non-PMD
sized VMAs, and the BUG_ON was getting hit due to rmap reaching the
non-isolated folios and somehow installing the PMD again. That is
why I say that the BUG_ON is important since it will help us catch
bugs early. And we have that for the PMD case anyways so why not
for mTHP...

>
> Cheers,
> -- Nico
>>
>> [1] https://lore.kernel.org/all/08d13445-5ed1-42ea-8aee-c1dbde24407e@arm.com/
>>
>>
>> [---snip---]
>>
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ