[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7df13a2d-1c95-c753-da64-4a47e8872fd9@google.com>
Date: Mon, 10 Jul 2023 22:21:16 -0700 (PDT)
From: Hugh Dickins <hughd@...gle.com>
To: Zi Yan <ziy@...dia.com>
cc: Hugh Dickins <hughd@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
Mike Rapoport <rppt@...nel.org>,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
Matthew Wilcox <willy@...radead.org>,
David Hildenbrand <david@...hat.com>,
Suren Baghdasaryan <surenb@...gle.com>,
Qi Zheng <zhengqi.arch@...edance.com>,
Yang Shi <shy828301@...il.com>,
Mel Gorman <mgorman@...hsingularity.net>,
Peter Xu <peterx@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Will Deacon <will@...nel.org>, Yu Zhao <yuzhao@...gle.com>,
Alistair Popple <apopple@...dia.com>,
Ralph Campbell <rcampbell@...dia.com>,
Ira Weiny <ira.weiny@...el.com>,
Steven Price <steven.price@....com>,
SeongJae Park <sj@...nel.org>,
Lorenzo Stoakes <lstoakes@...il.com>,
Huang Ying <ying.huang@...el.com>,
Naoya Horiguchi <naoya.horiguchi@....com>,
Christophe Leroy <christophe.leroy@...roup.eu>,
Zack Rusin <zackr@...are.com>, Jason Gunthorpe <jgg@...pe.ca>,
Axel Rasmussen <axelrasmussen@...gle.com>,
Anshuman Khandual <anshuman.khandual@....com>,
Pasha Tatashin <pasha.tatashin@...een.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Minchan Kim <minchan@...nel.org>,
Christoph Hellwig <hch@...radead.org>,
Song Liu <song@...nel.org>,
Thomas Hellstrom <thomas.hellstrom@...ux.intel.com>,
Ryan Roberts <ryan.roberts@....com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v2 05/32] mm/filemap: allow pte_offset_map_lock() to
fail
On Mon, 10 Jul 2023, Zi Yan wrote:
> On 8 Jun 2023, at 21:11, Hugh Dickins wrote:
>
> > filemap_map_pages() allow pte_offset_map_lock() to fail; and remove the
> > pmd_devmap_trans_unstable() check from filemap_map_pmd(), which can safely
> > return to filemap_map_pages() and let pte_offset_map_lock() discover that.
> >
> > Signed-off-by: Hugh Dickins <hughd@...gle.com>
> > ---
> > mm/filemap.c | 12 +++++-------
> > 1 file changed, 5 insertions(+), 7 deletions(-)
> >
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index 28b42ee848a4..9e129ad43e0d 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -3408,13 +3408,6 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct folio *folio,
> > if (pmd_none(*vmf->pmd))
> > pmd_install(mm, vmf->pmd, &vmf->prealloc_pte);
> >
> > - /* See comment in handle_pte_fault() */
> > - if (pmd_devmap_trans_unstable(vmf->pmd)) {
> > - folio_unlock(folio);
> > - folio_put(folio);
> > - return true;
> > - }
> > -
>
> There is a pmd_trans_huge() check at the beginning, should it be removed
> as well? Since pte_offset_map_lock() is also able to detect it.
It probably could be removed: but mostly I avoided such cleanups,
in the hope that the patches could be more easily reviewed as safe.
But I was eager to delete that obscure pmd_devmap_trans_unstable().
The whole strategy of dealing with the pmd_trans_huge()-like cases first,
and only finally arriving at the pte_offset_map_lock() when other cases
have been excluded, could be reversed in *many* places. It had to be that
way before, because pte_offset_map_lock() could only cope with a page
table; but now we could reverse them to do the pte_offset_map_lock()
first, and only try the other cases when it fails.
That would in theory be more efficient; but whether measurably more
efficient I doubt. And very easy to introduce errors on the way:
my enthusiasm for such cleanups is low! But maybe there's a few
places where the rearrangement would be worthwhile.
>
> > return false;
> > }
> >
> > @@ -3501,6 +3494,11 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
> >
> > addr = vma->vm_start + ((start_pgoff - vma->vm_pgoff) << PAGE_SHIFT);
> > vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
> > + if (!vmf->pte) {
> > + folio_unlock(folio);
> > + folio_put(folio);
> > + goto out;
> > + }
> > do {
> > again:
> > page = folio_file_page(folio, xas.xa_index);
> > --
> > 2.35.3
>
> These two changes affect the ret value. Before, pmd_devmap_trans_unstable() == true
> made ret = VM_FAULT_NPAGE, but now ret is the default 0 value. So ret should be set
> to VM_FAULT_NPAGE before goto out in the second hunk?
Qi Zheng raised a similar question on the original posting, I answered
https://lore.kernel.org/linux-mm/fb9a9d57-dbd7-6a6e-d1cb-8dcd64c829a6@google.com/
It's a rare case to fault here, then find pmd_devmap(*pmd), and it really
doesn't matter whether we return VM_FAULT_NOPAGE or 0 for it - maybe I've
left it inconsistent between THP and devmap, but it doesn't really matter.
I haven't checked Matthew's v5 "new page table range API" posted today,
but I expect this all looks different here anyway.
Thanks a lot for checking these: they are now in 6.5-rc1, so if you find
something that needs fixing, all the more important that we do fix it.
Hugh
Powered by blists - more mailing lists