Message-ID: <30469615-2DDC-467E-A810-5EE8E1CFCB43@nvidia.com>
Date: Wed, 08 May 2024 12:22:08 -0400
From: Zi Yan <ziy@...dia.com>
To: Jason Gunthorpe <jgg@...dia.com>
Cc: Lance Yang <ioworker0@...il.com>, Alistair Popple <apopple@...dia.com>,
akpm@...ux-foundation.org, willy@...radead.org, sj@...nel.org,
maskray@...gle.com, ryan.roberts@....com, david@...hat.com,
21cnbao@...il.com, mhocko@...e.com, fengwei.yin@...el.com,
zokeefe@...gle.com, shy828301@...il.com, xiehuan09@...il.com,
libang.li@...group.com, wangkefeng.wang@...wei.com, songmuchun@...edance.com,
peterx@...hat.com, minchan@...nel.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Baolin Wang <baolin.wang@...ux.alibaba.com>
Subject: Re: [PATCH v4 2/3] mm/rmap: integrate PMD-mapped folio splitting into
pagewalk loop
On 8 May 2024, at 11:52, Jason Gunthorpe wrote:
> On Wed, May 08, 2024 at 10:56:34AM -0400, Zi Yan wrote:
>
>> Lance is improving try_to_unmap_one() to support unmapping PMD THP as a whole,
>> so he moves split_huge_pmd_address() inside while (page_vma_mapped_walk(&pvmw))
>> and after mmu_notifier_invalidate_range_start() as split_huge_pmd_locked()
>> and does not include the mmu notifier ops inside split_huge_pmd_address().
>> I wonder if that could cause issues, since the mmu_notifier_invalidate_range_start()
>> before the while loop only has range of the original address and
>> split huge pmd can affect the entire PMD address range and these two ranges
>> might not be the same.
>
> That does not sound entirely good..
>
> I suppose it depends on what split does, if the MM page table has the
> same translation before and after split then perhaps no invalidation
> is even necessary.
Before the split, there is one PMD mapping a PMD-sized THP (order-9). After
the split, there are 512 PTEs mapping the same THP. Unless the secondary TLB
does not support PMD mappings and already uses 512 PTEs instead, this looks
like an issue to me.
In terms of the two mmu_notifier ranges, the first is in
split_huge_pmd_address()[1] and the second is in try_to_unmap_one()[2].
When try_to_unmap_one() unmaps a subpage in the middle of a PMD THP, the
former notifies about the PMD-range change caused by splitting one PMD into
512 PTEs, while the latter only needs to notify about the invalidation of
the single unmapped PTE. I do not think the latter can replace the former,
although a potential optimization is that the latter could be removed, since
its range is included in that of the former.
Regarding Lance's current code change, is it OK to change the mmu_notifier
range after mmu_notifier_invalidate_range_start()? In Lance's code, the
first mmu_notifier is gone, and the second, whose range covers only the
single PTE, calls mmu_notifier_invalidate_range_start(); the code then
discovers that a PMD needs to be split into 512 PTEs. Would changing the
range from PTE to PMD suffice? Or should the code call
mmu_notifier_invalidate_range_start() again with the new PMD range? I am
not even sure that two starts paired with one end is legitimate.
[1] https://elixir.bootlin.com/linux/v6.9-rc7/source/mm/huge_memory.c#L2658
[2] https://elixir.bootlin.com/linux/v6.9-rc7/source/mm/rmap.c#L1650
--
Best Regards,
Yan, Zi