Message-ID: <3f8e9805-b90f-7df3-8514-139afa653671@bytedance.com>
Date: Tue, 31 Aug 2021 20:36:51 +0800
From: Qi Zheng <zhengqi.arch@...edance.com>
To: David Hildenbrand <david@...hat.com>, akpm@...ux-foundation.org,
tglx@...utronix.de, hannes@...xchg.org, mhocko@...nel.org,
vdavydov.dev@...il.com, kirill.shutemov@...ux.intel.com,
mika.penttila@...tfour.com
Cc: linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, songmuchun@...edance.com
Subject: Re: [PATCH v1 2/2] mm: remove redundant smp_wmb()
On 2021/8/31 PM6:02, David Hildenbrand wrote:
> On 28.08.21 06:23, Qi Zheng wrote:
>> The smp_wmb() in __pte_alloc() is used to ensure that all pte
>> setup is visible before the pte is made visible to other CPUs
>> by being put into page tables. We only need this when the pte
>> is actually populated, so move it to pmd_install().
>> __pte_alloc_kernel(), __p4d_alloc(), __pud_alloc() and
>> __pmd_alloc() are similar cases.
>>
>> We can also defer the smp_wmb() to the place where the pmd entry
>> is really populated by the preallocated pte. There are two kinds
>> of users of the preallocated pte: one is filemap & finish_fault(),
>> the other is THP. The former does not need another smp_wmb()
>> because the smp_wmb() has already been done by pmd_install().
>> Fortunately, the latter does not need one either, because there
>> is already an smp_wmb() before the new pte is populated when THP
>> uses a preallocated pte to split a huge pmd.
>>
>> Signed-off-by: Qi Zheng <zhengqi.arch@...edance.com>
>> Reviewed-by: Muchun Song <songmuchun@...edance.com>
>> ---
>> mm/memory.c | 47 ++++++++++++++++++++---------------------------
>> mm/sparse-vmemmap.c | 2 +-
>> 2 files changed, 21 insertions(+), 28 deletions(-)
>>
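
For background, the pattern being optimized here is the usual
lockless publish: the writer must order its initialization of the
new pte page before publishing the pmd entry, while walkers rely
on data-dependent loads (plus the smp_rmb() in alpha's accessors).
A minimal sketch of the two sides, not the kernel's exact code:

    /* Publisher, under the pmd lock: */
    mm_inc_nr_ptes(mm);
    smp_wmb();                  /* order pte page init before publish */
    pmd_populate(mm, pmd, new); /* publish the pointer */

    /* Lockless walker (e.g. GUP-fast): */
    pmd_t pmdval = READ_ONCE(*pmd);
    if (!pmd_none(pmdval))
        ptep = pte_offset_map(&pmdval, addr); /* data-dependent load */
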
>> diff --git a/mm/memory.c b/mm/memory.c
>> index ef7b1762e996..9c7534187454 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -439,6 +439,20 @@ void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
>> if (likely(pmd_none(*pmd))) { /* Has another populated it ? */
>> mm_inc_nr_ptes(mm);
>> + /*
>> + * Ensure all pte setup (eg. pte page lock and page clearing) are
>> + * visible before the pte is made visible to other CPUs by being
>> + * put into page tables.
>> + *
>> + * The other side of the story is the pointer chasing in the page
>> + * table walking code (when walking the page table without locking;
>> + * ie. most of the time). Fortunately, these data accesses consist
>> + * of a chain of data-dependent loads, meaning most CPUs (alpha
>> + * being the notable exception) will already guarantee loads are
>> + * seen in-order. See the alpha page table accessors for the
>> + * smp_rmb() barriers in page table walking code.
>> + */
>> + smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
>> pmd_populate(mm, pmd, *pte);
>> *pte = NULL;
>> }
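
For reference, with this hunk applied the whole helper reads
roughly as below; the lines outside the quoted context are inferred
from the surrounding code, so treat this as a sketch:

    void pmd_install(struct mm_struct *mm, pmd_t *pmd, pgtable_t *pte)
    {
        spinlock_t *ptl = pmd_lock(mm, pmd);

        if (likely(pmd_none(*pmd))) {   /* Has another populated it ? */
            mm_inc_nr_ptes(mm);
            smp_wmb();  /* see the comment in the hunk above */
            pmd_populate(mm, pmd, *pte);
            *pte = NULL;
        }
        spin_unlock(ptl);
    }
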
>> @@ -451,21 +465,6 @@ int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
>> if (!new)
>> return -ENOMEM;
>> - /*
>> - * Ensure all pte setup (eg. pte page lock and page clearing) are
>> - * visible before the pte is made visible to other CPUs by being
>> - * put into page tables.
>> - *
>> - * The other side of the story is the pointer chasing in the page
>> - * table walking code (when walking the page table without locking;
>> - * ie. most of the time). Fortunately, these data accesses consist
>> - * of a chain of data-dependent loads, meaning most CPUs (alpha
>> - * being the notable exception) will already guarantee loads are
>> - * seen in-order. See the alpha page table accessors for the
>> - * smp_rmb() barriers in page table walking code.
>> - */
>> - smp_wmb(); /* Could be smp_wmb__xxx(before|after)_spin_lock */
>> -
>> pmd_install(mm, pmd, &new);
>> if (new)
>> pte_free(mm, new);
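
With the barrier hoisted into pmd_install(), __pte_alloc() reduces
to allocate-then-install (non-context lines inferred, sketch only):

    int __pte_alloc(struct mm_struct *mm, pmd_t *pmd)
    {
        pgtable_t new = pte_alloc_one(mm);
        if (!new)
            return -ENOMEM;

        pmd_install(mm, pmd, &new);
        if (new)
            pte_free(mm, new);
        return 0;
    }
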
>> @@ -478,10 +477,9 @@ int __pte_alloc_kernel(pmd_t *pmd)
>> if (!new)
>> return -ENOMEM;
>> - smp_wmb(); /* See comment in __pte_alloc */
>> -
>> spin_lock(&init_mm.page_table_lock);
>> if (likely(pmd_none(*pmd))) { /* Has another populated it ? */
>> + smp_wmb(); /* See comment in pmd_install() */
>> pmd_populate_kernel(&init_mm, pmd, new);
>> new = NULL;
>> }
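
The kernel variant cannot go through pmd_install() (it populates
init_mm via pmd_populate_kernel() under init_mm.page_table_lock),
so the barrier simply moves inside the pmd_none() check. The
resulting function, non-context lines inferred:

    int __pte_alloc_kernel(pmd_t *pmd)
    {
        pte_t *new = pte_alloc_one_kernel(&init_mm);
        if (!new)
            return -ENOMEM;

        spin_lock(&init_mm.page_table_lock);
        if (likely(pmd_none(*pmd))) {   /* Has another populated it ? */
            smp_wmb();  /* See comment in pmd_install() */
            pmd_populate_kernel(&init_mm, pmd, new);
            new = NULL;
        }
        spin_unlock(&init_mm.page_table_lock);
        if (new)
            pte_free_kernel(&init_mm, new);
        return 0;
    }
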
>> @@ -3857,7 +3855,6 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>> vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
>> if (!vmf->prealloc_pte)
>> return VM_FAULT_OOM;
>> - smp_wmb(); /* See comment in __pte_alloc() */
>> }
>> ret = vma->vm_ops->fault(vmf);
>> @@ -3919,7 +3916,6 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
>> vmf->prealloc_pte = pte_alloc_one(vma->vm_mm);
>> if (!vmf->prealloc_pte)
>> return VM_FAULT_OOM;
>> - smp_wmb(); /* See comment in __pte_alloc() */
>> }
>> vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
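
This prealloc_pte is the page table that is later deposited for the
huge pmd; when that huge pmd is split, the split code publishes the
pte table with its own barrier, along the lines of (simplified from
__split_huge_pmd_locked(), so the exact shape may differ):

    pgtable = pgtable_trans_huge_withdraw(mm, pmd);
    /* ... set up the ptes in the withdrawn table ... */
    smp_wmb();  /* make pte visible before pmd */
    pmd_populate(mm, pmd, pgtable);
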
>> @@ -4144,7 +4140,6 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf)
>> vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm);
>> if (!vmf->prealloc_pte)
>> return VM_FAULT_OOM;
>> - smp_wmb(); /* See comment in __pte_alloc() */
>> }
>> return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff);
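
All three fault paths above can drop the barrier because the
preallocated pte is only published later through pmd_install(),
e.g. in finish_fault() as reworked by patch 1/2 of this series,
roughly:

    if (pmd_none(*vmf->pmd) && vmf->prealloc_pte)
        pmd_install(vma->vm_mm, vmf->pmd, &vmf->prealloc_pte);
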
>> @@ -4819,13 +4814,13 @@ int __p4d_alloc(struct mm_struct *mm, pgd_t *pgd, unsigned long address)
>> if (!new)
>> return -ENOMEM;
>> - smp_wmb(); /* See comment in __pte_alloc */
>> -
>> spin_lock(&mm->page_table_lock);
>> if (pgd_present(*pgd)) /* Another has populated it */
>> p4d_free(mm, new);
>> - else
>> + else {
>> + smp_wmb(); /* See comment in pmd_install() */
>> pgd_populate(mm, pgd, new);
>> + }
>
> Nit:
>
> if () {
>
> } else {
>
> }
>
> see Documentation/process/coding-style.rst
>
> "This does not apply if only one branch of a conditional statement is a
> single statement; in the latter case use braces in both branches:"
Got it.
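
Applying that rule here, both branches get braces:

    spin_lock(&mm->page_table_lock);
    if (pgd_present(*pgd)) {        /* Another has populated it */
        p4d_free(mm, new);
    } else {
        smp_wmb();      /* See comment in pmd_install() */
        pgd_populate(mm, pgd, new);
    }
    spin_unlock(&mm->page_table_lock);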
>
>
> Apart from that, I think this is fine,
>
> Acked-by: David Hildenbrand <david@...hat.com>
>
Thanks,
Qi