Message-ID: <c65f7e9c-2e57-4fd5-973f-fc546c8c5827@redhat.com>
Date: Thu, 19 Dec 2024 13:58:06 +0100
From: David Hildenbrand <david@...hat.com>
To: Donet Tom <donettom@...ux.ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Cc: Ritesh Harjani <ritesh.list@...il.com>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
"Aneesh Kumar K . V" <aneesh.kumar@...nel.org>, Zi Yan <ziy@...dia.com>,
shuah Khan <shuah@...nel.org>, Dev Jain <dev.jain@....com>
Subject: Re: [PATCH] mm: migration :shared anonymous migration test is failing
On 19.12.24 13:47, Donet Tom wrote:
> The migration selftest is currently failing for shared anonymous
> mappings due to a race condition.
>
> During migration, the source folio's PTE is unmapped by nuking the
> PTE, flushing the TLB, and then marking the page for migration
> (by creating the swap entries). The issue arises when, immediately
> after the PTE is nuked and the TLB is flushed, but before the page
> is marked for migration, another thread accesses the page. This
> triggers a page fault, and the page fault handler invokes
> do_pte_missing() instead of do_swap_page(), as the page is not yet
> marked for migration.
>
> In the fault handling path, do_pte_missing() calls __do_fault()
> ->shmem_fault() -> shmem_get_folio_gfp() -> filemap_get_entry().
> This eventually calls folio_try_get(), incrementing the reference
> count of the folio undergoing migration. The thread then blocks
> on folio_lock(), as the migration path holds the lock. This
> results in the migration failing in __migrate_folio(), which expects
> the folio's reference count to be 2. However, the reference count is
> incremented by the fault handler, leading to the failure.
>
> The issue arises because, after nuking the PTE and before marking the
> page for migration, the page is accessed. To address this, we have
> updated the logic to first nuke the PTE, then mark the page for
> migration, and only then flush the TLB. With this patch, if the page is
> accessed immediately after nuking the PTE, the TLB entry is still
> valid, so no fault occurs.

But what if the PTE is not cached in the TLB yet, and you get an access
from another CPU just after clearing the PTE (but before flushing the TLB)?
The other CPU will still observe PTE=none, trigger a fault, etc.

So I don't think what you propose rules out all cases.
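
The remaining window can be sketched with a toy model. This is not kernel
code: the enums and the helpers access_page() and access_in_window() are
hypothetical names made up for illustration, and the model only encodes the
behaviour described in this thread (a cached translation avoids the fault,
an empty PTE leads to the do_pte_missing() path, a migration entry leads to
the do_swap_page() path).

```c
#include <stdbool.h>

/* Toy model only: hypothetical types, not kernel data structures. */
enum pte_state { PTE_PRESENT, PTE_NONE, PTE_MIGRATION_ENTRY };
enum fault_path { NO_FAULT, FAULT_PTE_MISSING, FAULT_SWAP_PAGE };

/*
 * Outcome of an access from another CPU, given the PTE contents and
 * whether that CPU still has the translation cached in its TLB.
 */
static enum fault_path access_page(enum pte_state pte, bool tlb_cached)
{
    if (tlb_cached)
        return NO_FAULT;              /* cached translation: no page table walk */

    switch (pte) {
    case PTE_PRESENT:
        return NO_FAULT;              /* walk finds a valid mapping */
    case PTE_NONE:
        return FAULT_PTE_MISSING;     /* do_pte_missing() path: takes a folio ref */
    case PTE_MIGRATION_ENTRY:
        return FAULT_SWAP_PAGE;       /* do_swap_page() path */
    }
    return NO_FAULT;
}

/*
 * Proposed ordering: the PTE is cleared, the migration entry is not yet
 * installed, and the TLB is not yet flushed. An access lands in that window.
 */
static enum fault_path access_in_window(bool tlb_cached)
{
    return access_page(PTE_NONE, tlb_cached);
}
```

In this model, access_in_window(true) returns NO_FAULT, which is the case the
patch description relies on, while access_in_window(false) returns
FAULT_PTE_MISSING, which is the case raised above.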
--
Cheers,
David / dhildenb