Message-ID: <e9cd1d61-c475-9b13-fd48-3ff886c74797@loongson.cn>
Date: Wed, 20 May 2020 14:39:13 +0800
From: maobibo <maobibo@...ngson.cn>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Jiaxun Yang <jiaxun.yang@...goat.com>,
Huacai Chen <chenhc@...ote.com>,
Paul Burton <paulburton@...nel.org>,
Dmitry Korotin <dkorotin@...ecomp.com>,
Philippe Mathieu-Daudé <f4bug@...at.org>,
Stafford Horne <shorne@...il.com>,
Steven Price <steven.price@....com>,
Anshuman Khandual <anshuman.khandual@....com>,
linux-mips@...r.kernel.org, linux-kernel@...r.kernel.org,
Mike Rapoport <rppt@...ux.ibm.com>,
Sergei Shtylyov <sergei.shtylyov@...entembedded.com>,
"Maciej W. Rozycki" <macro@....com>, linux-mm@...ck.org,
David Hildenbrand <david@...hat.com>
Subject: Re: [PATCH v4 2/4] mm/memory.c: Update local TLB if PTE entry exists
On 05/20/2020 09:26 AM, Andrew Morton wrote:
> On Tue, 19 May 2020 18:03:28 +0800 Bibo Mao <maobibo@...ngson.cn> wrote:
>
>> If two threads concurrently fault at the same address, the thread that
>> won the race updates the PTE and its local TLB. For now, the other
>> thread gives up, simply does nothing, and continues.
>>
>> It could happen that this second thread triggers another fault, whereby
>> it only updates its local TLB while handling the fault. Instead of
>> triggering another fault, let's directly update the local TLB of the
>> second thread.
>>
>> This is only useful on architectures where software can update the TLB;
>> it may have some negative effect if update_mmu_cache is also used for
>> other purposes. Multiple threads seldom access the same page at the
>> same time, so the negative effect on other arches is limited.
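
[ For reference, a simplified sketch of the race described above, based
on the !pte_none() check in do_anonymous_page() in mm/memory.c; the
update_mmu_cache() call added here only illustrates the idea, it is not
the exact patch hunk: ]

	vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd,
				       vmf->address, &vmf->ptl);
	if (!pte_none(*vmf->pte)) {
		/*
		 * Another thread won the race and has already set up
		 * this PTE (and filled its own local TLB). Instead of
		 * backing out and letting this thread fault again just
		 * to refill its local TLB, refill it here directly on
		 * software-TLB arches.
		 */
		update_mmu_cache(vma, vmf->address, vmf->pte);
		goto release;
	}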
>
> I'm still worried about the impact on other architectures. The
> additional update_mmu_cache() calls won't occur only when multiple
> threads are racing against the same page, I think? For example,
> insert_pfn() will do this when making a read-only page a writable one.
How about defining the ptep_set_access_flags function like this on the
MIPS system? It is the same as the RISC-V implementation:
static inline int ptep_set_access_flags(struct vm_area_struct *vma,
					unsigned long address, pte_t *ptep,
					pte_t entry, int dirty)
{
	if (!pte_same(*ptep, entry))
		set_pte_at(vma->vm_mm, address, ptep, entry);
	/*
	 * update_mmu_cache will unconditionally execute, handling both
	 * the case that the PTE changed and the spurious fault case.
	 */
	return true;
}
Since this version always returns true, update_mmu_cache() will run
unconditionally on MIPS, and the following pieces of generic code can be
kept unchanged; the overall change will be smaller:
@@ -1770,8 +1770,8 @@ static vm_fault_t insert_pfn(struct vm_area_struct *vma, unsigned long addr,
 		}
 		entry = pte_mkyoung(*pte);
 		entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-		if (ptep_set_access_flags(vma, addr, pte, entry, 1))
-			update_mmu_cache(vma, addr, pte);
+		ptep_set_access_flags(vma, addr, pte, entry, 1);
+		update_mmu_cache(vma, addr, pte);
 	}
@@ -2436,17 +2436,16 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
 		entry = pte_mkyoung(vmf->orig_pte);
-		if (ptep_set_access_flags(vma, addr, vmf->pte, entry, 0))
-			update_mmu_cache(vma, addr, vmf->pte);
+		ptep_set_access_flags(vma, addr, vmf->pte, entry, 0);
+		update_mmu_cache(vma, addr, vmf->pte);
 	}
@@ -2618,8 +2618,8 @@ static inline void wp_page_reuse(struct vm_fault *vmf)
 	flush_cache_page(vma, vmf->address, pte_pfn(vmf->orig_pte));
 	entry = pte_mkyoung(vmf->orig_pte);
 	entry = maybe_mkwrite(pte_mkdirty(entry), vma);
-	if (ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1))
-		update_mmu_cache(vma, vmf->address, vmf->pte);
+	ptep_set_access_flags(vma, vmf->address, vmf->pte, entry, 1);
+	update_mmu_cache(vma, vmf->address, vmf->pte);
 	pte_unmap_unlock(vmf->pte, vmf->ptl);
 }
>
> Would you have time to add some instrumentation into update_mmu_cache()
> (maybe a tracepoint) and see what effect this change has upon the
> frequency at which update_mmu_cache() is called for a selection of
> workloads? And add this info to the changelog to set minds at ease?
OK, I will add some instrumentation data to the changelog.
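
For a first measurement, something as simple as a counter in
update_mmu_cache() may be enough. A minimal, hypothetical sketch (an
atomic counter exposed through debugfs; a real patch might use a
TRACE_EVENT() tracepoint instead, and all names below are made up for
illustration):

#include <linux/atomic.h>
#include <linux/debugfs.h>
#include <linux/init.h>

/*
 * Hypothetical instrumentation: count update_mmu_cache() invocations
 * so their frequency can be compared across workloads, before and
 * after the change.
 */
static atomic_t update_mmu_cache_calls = ATOMIC_INIT(0);

static inline void count_update_mmu_cache(void)
{
	/* Call this from update_mmu_cache() on the arch under test. */
	atomic_inc(&update_mmu_cache_calls);
}

static int __init update_mmu_cache_stats_init(void)
{
	/* Readable at /sys/kernel/debug/update_mmu_cache_calls */
	debugfs_create_atomic_t("update_mmu_cache_calls", 0444, NULL,
				&update_mmu_cache_calls);
	return 0;
}
late_initcall(update_mmu_cache_stats_init);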