Message-ID: <537D4547-383D-4AAF-9F9F-8A37B0BCB7BD@nvidia.com>
Date: Fri, 18 Apr 2025 09:25:57 -0400
From: Zi Yan <ziy@...dia.com>
To: Hugh Dickins <hughd@...gle.com>
Cc: Gavin Guo <gavinguo@...lia.com>, linux-mm@...ck.org,
akpm@...ux-foundation.org, willy@...radead.org, linmiaohe@...wei.com,
revest@...gle.com, david@...hat.com, kernel-dev@...lia.com,
linux-kernel@...r.kernel.org, Naoya Horiguchi <n-horiguchi@...jp.nec.com>
Subject: Re: [PATCH] mm/huge_memory: fix dereferencing invalid pmd migration entry
On 17 Apr 2025, at 1:29, Hugh Dickins wrote:
> On Tue, 15 Apr 2025, Zi Yan wrote:
>>
>> Anyway, we need to figure out why both THP migration and deferred_split_scan()
>> hold the THP lock first, which sounds impossible to me. Or some other execution
>> interleaving is happening.
>
> I think perhaps you're missing that an anon_vma lookup points to a
> location which may contain the folio of interest, but might instead
> contain another folio: and weeding out those other folios is precisely
> what the "folio != pmd_folio(*pmd)" check (and the "risk of replacing
> the wrong folio" comment a few lines above it) is for.

Yes, from Gavin’s commit log, I thought both migration and deferred split
were working on the same folio. But after rereading it along with your
explanation, I now understand that both are working on the same PMD
migration entry. Thank you for the explanation.

>
> The "BUG: unable to handle page fault" comes about because that other
> folio might actually be being migrated at this time, so we encounter
> a PMD migration entry instead of a valid PMD entry. But if it's the
> folio we're looking for, our folio lock excludes a racing migration,
> so it would never be a PMD migration entry for our folio.
>
> Hugh
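
To make the ordering described above concrete, here is a rough userspace
sketch. It is only a toy model under stated assumptions, not the kernel code
or the patch itself: toy_folio, toy_pmd, toy_should_split() and
toy_self_check() are hypothetical names invented for illustration, and
treating a NULL folio as "no specific folio requested" is an assumption of
the sketch. The point it shows is that, when we hold the lock on the folio
we are looking for, a PMD migration entry must belong to some other folio,
so it can be skipped before anything tries to decode a folio from it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are not the kernel's types. */
struct toy_folio { int id; };

enum toy_pmd_kind { TOY_PMD_PRESENT_HUGE, TOY_PMD_MIGRATION };

struct toy_pmd {
	enum toy_pmd_kind kind;
	struct toy_folio *folio;	/* meaningful only for TOY_PMD_PRESENT_HUGE */
};

/*
 * Decide whether a split should proceed on this pmd.
 * 'locked' models the folio whose lock the caller holds; NULL is assumed
 * to mean that the caller did not ask for a specific folio.
 */
static bool toy_should_split(const struct toy_pmd *pmd,
			     const struct toy_folio *locked)
{
	if (locked == NULL)
		return true;

	/*
	 * Holding the folio lock excludes a racing migration of 'locked',
	 * so a migration entry here belongs to another folio. Bail out
	 * before reading pmd->folio, which is not valid for this kind.
	 */
	if (pmd->kind == TOY_PMD_MIGRATION)
		return false;

	/* Weed out other folios found at the same location. */
	return pmd->folio == locked;
}

/* Small self-check covering the cases discussed in the thread. */
static void toy_self_check(void)
{
	struct toy_folio ours = { 1 }, other = { 2 };
	struct toy_pmd maps_ours  = { TOY_PMD_PRESENT_HUGE, &ours };
	struct toy_pmd maps_other = { TOY_PMD_PRESENT_HUGE, &other };
	struct toy_pmd migrating  = { TOY_PMD_MIGRATION, NULL };

	assert(toy_should_split(&maps_ours, &ours));
	assert(!toy_should_split(&maps_other, &ours));
	assert(!toy_should_split(&migrating, &ours));
	assert(toy_should_split(&migrating, NULL));
}
```

Calling toy_self_check() from a small main() exercises all four cases. The
case that matters for the reported BUG is the third one: the migration entry
is rejected without its folio field ever being read.
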
Best Regards,
Yan, Zi