Message-ID: <ea15f3d3-5dd8-4404-8dab-5673bb5f3413@arm.com>
Date: Thu, 5 Dec 2024 15:40:08 +0530
From: Dev Jain <dev.jain@....com>
To: ryan.roberts@....com, david@...hat.com, kirill.shutemov@...ux.intel.com,
willy@...radead.org, ziy@...dia.com, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [QUESTION] anon_vma lock in khugepaged
On 28/11/24 11:56 am, Dev Jain wrote:
> Hi, I was looking at the khugepaged code and I cannot figure out what the problem would be
> if we took the mmap lock in read mode. Shouldn't just taking the PMD lock, then the PTL,
> then unlocking the PTL, then unlocking the PMD, solve any races with page table walkers?
>
>
Similar questions:
1. Why do we need anon_vma_lock_write() in collapse_huge_page()? AFAIK we need to walk the anon_vmas only
when we are forking, or when we are unmapping a folio and need to find all VMAs mapping it. The fork path takes
mmap_write_lock(), so there is no problem there. For the unmap path, if we took only anon_vma_lock_read(),
kswapd could isolate the folio from the LRU, walk the rmap and swap the folio out, and khugepaged would then fail in
folio_isolate_lru(). That is not fatal, though, just a performance degradation caused by a race (and the
entire code is racy anyway). What am I missing?
2. In which scenarios does rmap come into play? Fork and swap-out; are there any others I am missing?
3. Please confirm whether this is correct: in contrast to page migration, we do not need to do an rmap walk and nuke all
PTEs referencing the folio, because for anon non-shmem folios the only way a folio can be shared is through fork.
If it is shared, folio_put() will not release the folio in __collapse_huge_page_copy_succeeded() -> free_page_and_swap_cache(),
so the old folio stays around and child processes can still read from it. Page migration, on the other hand, requires
that we be able to free the old folios.
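
To make the reasoning in point 3 concrete, here is a toy userspace model of the refcount argument. It is only a sketch of my understanding, not kernel code: struct toy_folio and the toy_* functions are hypothetical names I made up, and the single integer refcount is a simplification of the real folio reference and mapcount handling.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; NOT kernel code. All toy_* names are hypothetical. */
struct toy_folio {
	int refcount;	/* one reference per process still mapping the folio */
	bool freed;	/* set once the last reference has been dropped */
};

/* Drop one reference; "free" the folio only when the count reaches zero. */
static void toy_folio_put(struct toy_folio *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		f->freed = true;
}

/*
 * Models the collapse path dropping the collapsing process's reference
 * to the old small folio after its contents have been copied.
 * Returns true if the old folio is still alive afterwards.
 */
static bool toy_collapse_drops_ref(struct toy_folio *old)
{
	toy_folio_put(old);
	return !old->freed;
}
```

With refcount == 2 (parent plus one forked child), toy_collapse_drops_ref() leaves the folio alive with refcount == 1, so the child can keep reading from it. With refcount == 1 (not shared), the same call frees it.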