[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <8dd76501-9e19-4471-86e2-6c4254fd8d42@linux.dev>
Date: Wed, 15 Oct 2025 17:31:15 +0800
From: Lance Yang <lance.yang@...ux.dev>
To: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
Cc: David Hildenbrand <david@...hat.com>, akpm@...ux-foundation.org,
ziy@...dia.com, baolin.wang@...ux.alibaba.com, Liam.Howlett@...cle.com,
npache@...hat.com, ryan.roberts@....com, dev.jain@....com,
baohua@...nel.org, ioworker0@...il.com, richard.weiyang@...il.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH mm-new v3 1/3] mm/khugepaged: optimize PTE scanning with
if-else-if-else-if chain
On 2025/10/15 17:16, Lorenzo Stoakes wrote:
> On Wed, Oct 15, 2025 at 12:49:26PM +0800, Lance Yang wrote:
>>
>>
>> On 2025/10/14 20:27, David Hildenbrand wrote:
>>> On 14.10.25 14:17, Lorenzo Stoakes wrote:
>>>> On Wed, Oct 08, 2025 at 12:37:46PM +0800, Lance Yang wrote:
>>>>> From: Lance Yang <lance.yang@...ux.dev>
>>>>>
>>>>> As pointed out by Dev, the PTE checks for disjoint conditions in the
>>>>> scanning loops can be optimized. is_swap_pte, (pte_none && is_zero_pfn),
>>>>> and pte_uffd_wp are mutually exclusive.
>>>>
>>>> But you're not using is_swap_pte anywhere :) This comes back to my review
>>>> quesiotn on the series this is dependent upon.
>>>>
>>>>>
>>>>> This patch refactors the loops in both
>>>>> __collapse_huge_page_isolate() and
>>>>> hpage_collapse_scan_pmd() to use a continuous if-else-if-else-if chain
>>>>> instead of separate if blocks. While at it, the redundant pte_present()
>>>>> check before is_zero_pfn() is also removed.
>>>>
>>>> I mean see review below, I don't see why you're doing this and I am
>>>> unconvinced by how redundant that check is.
>>
>> Ah, good catch! Lorenzo, thanks!!!
>>
>>>>
>>>> Also this just feels like it should be part of the series where you
>>>> change
>>>> these? I'm not sure why this is separate?
>>>
>>> I think Lance is trying to unify both scanning functions to look alike,
>>> such that when he refactors them out in patch #3 it looks more straight
>>> forward.
>>>
>>> The missing pte_present() check in hpage_collapse_scan_pmd() is interesting
>>
>> Yep, indeed ;)
>>
>>>
>>> Likely there is one such check missing there?
>>
>> I think the risk is exactly how pte_pfn() would handle a swap PTE ...
>>
>> A swap PTE contains completely different data(swap type and offset).
>> pte_pfn() doesn't know this, so if we feed a swap entry to it, it will
>> spit out a junk PFN :)
>>
>> What if that junk PFN happens to match the zeropage's PFN by sheer
>> chance? IHMO, it's really unlikely, but it would be really bad if it did.
>>
>> Clearly, pte_present() prevents this ;)
>
> Yeah, not so clearly kinda the whole point I'm making here. I mean all this code
> sucks because we have deeply nested conditions and you have to keep in your mind
> that 'oh we already checked for X so we can do Y'.
I see :)
>
> But the THP code is horrible in general.
Yep, you'll get no argument from me on that one ;p
The code is indeed tricky ...
>
> Anyway let's stay focused (so I can stand a chance of catching up with my
> post-vacation backlog :), I will check the respin once you send!
Cheers!
This series has been dropped from mm-new. I'm going to take a bit to
rethink the approach based on the feedback.
>
>>
>> By the way, I noticed there are other places in khugepaged.c that
>> call pte_pfn() without being under a pte_present() check.
>>
>> Perhaps those should all be fixed as well in a separate patch?
>
> Yeah I'm not surprised and sure indeed.
>
> Thanks, Lorenzo
In the meantime, I'll send up a separate patch for the missing
pte_present() checks first ;)
Thanks,
Lance
Powered by blists - more mailing lists