lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3e8253c4-9181-4027-84ee-28e1fc488f61@bytedance.com>
Date: Mon, 12 Aug 2024 14:21:04 +0800
From: Qi Zheng <zhengqi.arch@...edance.com>
To: David Hildenbrand <david@...hat.com>
Cc: hughd@...gle.com, willy@...radead.org, mgorman@...e.de,
 muchun.song@...ux.dev, vbabka@...nel.org, akpm@...ux-foundation.org,
 zokeefe@...gle.com, rientjes@...gle.com, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, the arch/x86 maintainers <x86@...nel.org>
Subject: Re: [RFC PATCH v2 1/7] mm: pgtable: make pte_offset_map_nolock()
 return pmdval

Hi David,

On 2024/8/10 00:54, David Hildenbrand wrote:
> On 07.08.24 05:08, Qi Zheng wrote:
>> Hi David,
>>
>> On 2024/8/6 22:16, David Hildenbrand wrote:
>>> On 06.08.24 04:40, Qi Zheng wrote:
>>>> Hi David,
>>>>
>>>> On 2024/8/5 22:43, David Hildenbrand wrote:
>>>>> On 05.08.24 14:55, Qi Zheng wrote:
>>>>>> Make pte_offset_map_nolock() return pmdval so that we can recheck the
>>>>>> *pmd once the lock is taken. This is a preparation for freeing empty
>>>>>> PTE pages, no functional changes are expected.
>>>>>
>>>>> Skimming the patches, only patch #4 updates one of the callsites
>>>>> (collapse_pte_mapped_thp).
>>>>
>>>> In addition, retract_page_tables() and reclaim_pgtables_pmd_entry()
>>>> also used the pmdval returned by pte_offset_map_nolock().
>>>
>>> Right, and I am questioning if only touching these two is sufficient,
>>> and how we can make it clearer when someone actually has to recheck the
>>> PMD.
>>>
>>>>
>>>>>
>>>>> Wouldn't we have to recheck if the PMD val changed in more cases after
>>>>> taking the PTL?
>>>>>
>>>>> If not, would it make sense to have a separate function that 
>>>>> returns the
>>>>> pmdval and we won't have to update each and every callsite?
>>>>
>>>> pte_offset_map_nolock() had already obtained the pmdval previously, 
>>>> just
>>>> hadn't returned it. And updating those callsite is simple, so I think
>>>> there may not be a need to add a separate function.
>>>
>>> Let me ask this way: why is retract_page_tables() and
>>> reclaim_pgtables_pmd_entry() different to the other ones, and how would
>>> someone using pte_offset_map_nolock() know what's to do here?
>>
>> If we acuqire the PTL (PTE or PMD lock) after calling
>> pte_offset_map_nolock(), it means we may be modifying the corresponding
>> pte or pmd entry. In that case, we need to perform a pmd_same() check
>> after holding the PTL, just like in pte_offset_map_lock(), to prevent
>> the possibility of the PTE page being reclaimed at that time.
> 
> Okay, what I thought.
> 
>>
>> If we call pte_offset_map_nolock() and do not need to acquire the PTL
>> afterwards, it means we are only reading the PTE page. In this case, the
>> rcu_read_lock() in pte_offset_map_nolock() will ensure that the PTE page
>> cannot be reclaimed.
>>
>>>
>>> IIUC, we must check the PMDVAL after taking the PTL in case
>>>
>>> (a) we want to modify the page table to turn pte_none() entries to
>>>       !pte_none(). Because it could be that the page table was 
>>> removed and
>>>       now is all pte_none()
>>>
>>> (b) we want to remove the page table ourselves and want to check if it
>>>       has already been removed.
>>>
>>> Is that it?
>>
>> Yes.
>>
>>>
>>> So my thinking is if another function variant can make that clearer.
>>
>> OK, how about naming it pte_offset_map_before_lock?
> 
> That's the issue with some of the code: for example in 
> filemap_fault_recheck_pte_none() we'll call pte_offset_map_nolock() and 
> conditionally take the PTL. But we won't be modifying the pages tables.
> 
> Maybe something like:
> 
> pte_offset_map_readonly_nolock()
> 
> and
> 
> pte_offset_map_maywrite_nolock()
> 
> The latter would require you to pass the PMD pointer such that you have 
> to really mess up to ignore what to do with it (check PMD same or not 
> check PMD same if you really know what you are douing).
> 
> The first would not take a PMD pointer at all, because there is no need to.

These two function names LGTM. Will do in the next version.

Thanks,
Qi

> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ