Message-ID: <87msztbiy8.fsf@linux.ibm.com>
Date: Tue, 18 Jul 2023 16:11:19 +0530
From: "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>
To: Hugh Dickins <hughd@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: Mike Kravetz <mike.kravetz@...cle.com>,
Mike Rapoport <rppt@...nel.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Matthew Wilcox <willy@...radead.org>,
David Hildenbrand <david@...hat.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Qi Zheng <zhengqi.arch@...edance.com>,
Yang Shi <shy828301@...il.com>,
Mel Gorman <mgorman@...hsingularity.net>,
Peter Xu <peterx@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Will Deacon <will@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
Alistair Popple <apopple@...dia.com>,
Ralph Campbell <rcampbell@...dia.com>,
Ira Weiny <ira.weiny@...el.com>,
Steven Price <steven.price@....com>,
SeongJae Park <sj@...nel.org>,
Lorenzo Stoakes <lstoakes@...il.com>,
Huang Ying <ying.huang@...el.com>,
Naoya Horiguchi <naoya.horiguchi@....com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Zack Rusin <zackr@...are.com>, Jason Gunthorpe <jgg@...pe.ca>,
Axel Rasmussen <axelrasmussen@...gle.com>,
Anshuman Khandual <anshuman.khandual@....com>,
Pasha Tatashin <pasha.tatashin@...een.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Minchan Kim <minchan@...nel.org>,
Christoph Hellwig <hch@...radead.org>,
Song Liu <song@...nel.org>,
Thomas Hellstrom <thomas.hellstrom@...ux.intel.com>,
Russell King <linux@...linux.org.uk>,
"David S. Miller" <davem@...emloft.net>,
Michael Ellerman <mpe@...erman.id.au>,
Heiko Carstens <hca@...ux.ibm.com>,
Christian Borntraeger <borntraeger@...ux.ibm.com>,
Claudio Imbrenda <imbrenda@...ux.ibm.com>,
Alexander Gordeev <agordeev@...ux.ibm.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Vasily Gorbik <gor@...ux.ibm.com>,
Jann Horn <jannh@...gle.com>,
Vishal Moola <vishal.moola@...il.com>,
Vlastimil Babka <vbabka@...e.cz>, Zi Yan <ziy@...dia.com>,
linux-arm-kernel@...ts.infradead.org, sparclinux@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v3 04/13] powerpc: assert_pte_locked() use
pte_offset_map_nolock()
Hugh Dickins <hughd@...gle.com> writes:
> Instead of pte_lockptr(), use the recently added pte_offset_map_nolock()
> in assert_pte_locked(). BUG if pte_offset_map_nolock() fails: this is
> stricter than the previous implementation, which skipped when pmd_none()
> (with a comment on khugepaged collapse transitions): but wouldn't we want
> to know, if an assert_pte_locked() caller can be racing such transitions?
>
The reason we had that pmd_none check there was to handle khugepaged. In
the khugepaged case we do a pmdp_collapse_flush and then do a ptep_clear,
and ppc64 had the assert_pte_locked check inside that ptep_clear:

  _pmd = pmdp_collapse_flush(vma, address, pmd);
  ..
  ptep_clear()
    -> assert_pte_locked()
      -> pmd_none
        -> BUG

The problem is how assert_pte_locked() verifies whether we are holding
the ptl: it does so by walking the page table again, and in this specific
case, by the time we call the function we have already cleared the pmd.
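
To spell out why that re-walk matters, here is a rough sketch of what
assert_pte_locked() sees at that point (paraphrased, not literal source;
names taken from the quoted patch, and the exact khugepaged call sites
may differ between kernel versions):

  /* during collapse, ptep_clear() runs after pmdp_collapse_flush() */
  pmd = pmd_offset(pud, addr);   /* re-walk inside assert_pte_locked() */
  /* pmd_none(*pmd) is true, because pmdp_collapse_flush() already ran */
  /* old code: if (pmd_none(*pmd)) return;            -> silent skip   */
  /* new code: pte_offset_map_nolock() returns NULL   -> BUG_ON(!pte)  */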
>
> This mod might cause new crashes: which either expose my ignorance, or
> indicate issues to be fixed, or limit the usage of assert_pte_locked().
>
> Signed-off-by: Hugh Dickins <hughd@...gle.com>
> ---
> arch/powerpc/mm/pgtable.c | 16 ++++++----------
> 1 file changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
> index cb2dcdb18f8e..16b061af86d7 100644
> --- a/arch/powerpc/mm/pgtable.c
> +++ b/arch/powerpc/mm/pgtable.c
> @@ -311,6 +311,8 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
> p4d_t *p4d;
> pud_t *pud;
> pmd_t *pmd;
> + pte_t *pte;
> + spinlock_t *ptl;
>
> if (mm == &init_mm)
> return;
> @@ -321,16 +323,10 @@ void assert_pte_locked(struct mm_struct *mm, unsigned long addr)
> pud = pud_offset(p4d, addr);
> BUG_ON(pud_none(*pud));
> pmd = pmd_offset(pud, addr);
> - /*
> - * khugepaged to collapse normal pages to hugepage, first set
> - * pmd to none to force page fault/gup to take mmap_lock. After
> - * pmd is set to none, we do a pte_clear which does this assertion
> - * so if we find pmd none, return.
> - */
> - if (pmd_none(*pmd))
> - return;
> - BUG_ON(!pmd_present(*pmd));
> - assert_spin_locked(pte_lockptr(mm, pmd));
> + pte = pte_offset_map_nolock(mm, pmd, addr, &ptl);
> + BUG_ON(!pte);
> + assert_spin_locked(ptl);
> + pte_unmap(pte);
> }
> #endif /* CONFIG_DEBUG_VM */
>
> --
> 2.35.3