Message-ID: <82632a98-e7e8-cf04-ea5c-f8c804184af8@linux.alibaba.com>
Date: Tue, 26 Apr 2022 14:26:24 +0800
From: Baolin Wang <baolin.wang@...ux.alibaba.com>
To: Mike Kravetz <mike.kravetz@...cle.com>, akpm@...ux-foundation.org
Cc: almasrymina@...gle.com, songmuchun@...edance.com,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] mm: rmap: Move the cache flushing to the correct
place for hugetlb PMD sharing
On 4/26/2022 8:20 AM, Mike Kravetz wrote:
> On 4/24/22 07:50, Baolin Wang wrote:
>> The cache level flush will always be first when changing an existing
>> virtual->physical mapping to a new value, since this allows us to
>> properly handle systems whose caches are strict and require a
>> virtual->physical translation to exist for a virtual address. So we
>> should move the cache flushing before huge_pmd_unshare().
>>
>> As Muchun pointed out[1], the architectures that currently support
>> hugetlb PMD sharing have no cache flush issues in practice. But I
>> think we should still follow the cache/TLB flushing rules when
>> changing a valid virtual address mapping, to guard against potential
>> issues in the future.
>>
>> [1] https://lore.kernel.org/all/YmT%2F%2FhuUbFX+KHcy@FVFYT0MHHV2J.usts.net/
>> Signed-off-by: Baolin Wang <baolin.wang@...ux.alibaba.com>
>> ---
>> mm/rmap.c | 40 ++++++++++++++++++++++------------------
>> 1 file changed, 22 insertions(+), 18 deletions(-)
>>
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index 61e63db..81872bb 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -1535,15 +1535,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>> * do this outside rmap routines.
>> */
>> VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
>> + /*
>> + * huge_pmd_unshare unmapped an entire PMD page.
>
> Perhaps update this comment to say that huge_pmd_unshare 'may' unmap
> an entire PMD page?
Sure, will do.
>
>> + * There is no way of knowing exactly which PMDs may
>> + * be cached for this mm, so we must flush them all.
>> + * start/end were already adjusted above to cover this
>> + * range.
>> + */
>> + flush_cache_range(vma, range.start, range.end);
>> +
>> if (huge_pmd_unshare(mm, vma, &address, pvmw.pte)) {
>> - /*
>> - * huge_pmd_unshare unmapped an entire PMD
>> - * page. There is no way of knowing exactly
>> - * which PMDs may be cached for this mm, so
>> - * we must flush them all. start/end were
>> - * already adjusted above to cover this range.
>> - */
>> - flush_cache_range(vma, range.start, range.end);
>> flush_tlb_range(vma, range.start, range.end);
>> mmu_notifier_invalidate_range(mm, range.start,
>> range.end);
>> @@ -1560,13 +1561,14 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>> page_vma_mapped_walk_done(&pvmw);
>> break;
>> }
>> + } else {
>> + flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
>
> I know this call to flush_cache_page() existed before your change. But, when
> looking at this now I wonder how hugetlb pages are handled? Are there any
> versions of flush_cache_page() that take page size into account?
Thanks for the reminder. I checked the flush_cache_page()
implementation on some architectures (like arm32), and they do not
take hugetlb pages into account, so I think we may fail to flush the
whole cache range for a hugetlb page on those architectures.

With this patch we can mitigate the issue, since we switch to
flush_cache_range() to cover the possible range when flushing the
cache for shared hugetlb mappings. But for anon hugetlb pages we
should also convert to flush_cache_range() instead. I think we can do
that conversion in a separate patch set, after auditing all the
places that use flush_cache_page() to flush the cache for hugetlb
pages. What do you think?
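
Just to illustrate what I have in mind for the anon hugetlb case, a
rough and untested sketch (assuming the existing folio_test_hugetlb()
and huge_page_size(hstate_vma(vma)) helpers can be used at this point
in try_to_unmap_one()) could look like:

	if (folio_test_hugetlb(folio)) {
		/*
		 * flush_cache_page() may only flush a single base page
		 * on some architectures, so flush the whole huge page
		 * range instead.
		 */
		flush_cache_range(vma, address,
				  address + huge_page_size(hstate_vma(vma)));
	} else {
		flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
	}

The exact placement and helpers would of course need to be confirmed
as part of that audit.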