[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <a3986998-1f63-4e8c-94ed-c6d1004bf111@bytedance.com>
Date: Fri, 30 Aug 2024 14:37:44 +0800
From: Qi Zheng <zhengqi.arch@...edance.com>
To: David Hildenbrand <david@...hat.com>
Cc: muchun.song@...ux.dev, hughd@...gle.com, willy@...radead.org,
vbabka@...nel.org, akpm@...ux-foundation.org, rppt@...nel.org,
vishal.moola@...il.com, peterx@...hat.com, ryan.roberts@....com,
christophe.leroy2@...soprasteria.com, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, linux-arm-kernel@...ts.infradead.org,
linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH v2 01/14] mm: pgtable: introduce
pte_offset_map_{ro|rw}_nolock()
On 2024/8/29 23:31, David Hildenbrand wrote:
> On 29.08.24 12:59, Qi Zheng wrote:
>>
>>
>> On 2024/8/28 18:48, David Hildenbrand wrote:
>>> On 27.08.24 06:33, Qi Zheng wrote:
>>
>> [...]
>>
>>>> sufficient AFAIUK.
>>>
>>> Drop the "AFAIUK" :)
>>>
>>> "For R/O access this is sufficient."
>>>
>>>>
>>>> pte_offset_map_rw_nolock(mm, pmd, addr, pmdvalp, ptlp), above, is like
>>>> pte_offset_map_ro_nolock(); but when successful, it also outputs the
>>>> pdmval. For R/W access, the callers can not accept that the page table
>>>> it sees has been unmapped and is about to get freed. The pmdval can
>>>> help
>>>> callers to recheck pmd_same() to identify this case once the
>>>> spinlock is
>>>> taken. For some cases where exclusivity is already guaranteed, such as
>>>> holding the write lock of mmap_lock, or in cases where checking is
>>>> sufficient, such as a !pte_none() pte will be rechecked after the
>>>> spinlock is taken, there is no need to recheck pdmval.
>>>
>>> Right, using pte_same() one can achieve a similar result, assuming that
>>> the freed page table gets all ptes set to pte_none().
>>>
>>> page_table_check_pte_clear_range() before pte_free_defer() in
>>> retract_page_tables/collapse_pte_mapped_thp() sanity checks that I
>>> think.
>>
>> Since commit 1d65b771bc08, retract_page_tables() only holds the
>> i_mmap_lock_read(mapping) but not mmap_lock, so it seems that
>> holding the write lock of mmap_lock cannot guarantee the stability
>> of the PTE page.
>
> Guess it depends. khugepaged on anonymous memory will block any page
> table walkers (like anon THP collapse does) -- per-VMA lock, mmap lock,
> mapping lock/RMAP lock ... so it *should* be sufficient to hold any of
> these, right?
retract_page_tables() itself is safe, but because it does not hold the
read lock of mmap_lock, other paths that only hold the write lock of
mmap_lock may be concurrent with it, such as the paths in
[PATCH v2 08/14] and [PATCH v2 09/14].
>
> So at least for now, these (anonymous memory) cases would be ok. Likely
> that will change when reclaiming empty page tables.
When reclaiming the empty page tables, I will hold the read lock of
mmap_lock.
Therefore, either perform a pmd_same() check on the case where the
write lock of mmap_lock is held, or add the read lock of mmap_lock
to retract_page_tables() as well.
>
>>
>> IIUC, I will also perform a pmd_same() check on the case where the
>> write lock of mmap_lock is held in v3. Or do I miss something?
>
> Can you spell out the instances where you think it might be required.
For example, the paths in [PATCH v2 08/14] and [PATCH v2 09/14] need
to do pmd_same() check after holding the PTL.
>
Powered by blists - more mailing lists