[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240307201655.GF9179@nvidia.com>
Date: Thu, 7 Mar 2024 16:16:55 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: peterx@...hat.com
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org,
Andrew Morton <akpm@...ux-foundation.org>,
Muchun Song <muchun.song@...ux.dev>,
Matthew Wilcox <willy@...radead.org>,
Mike Rapoport <rppt@...nel.org>,
Christophe Leroy <christophe.leroy@...roup.eu>, x86@...nel.org,
sparclinux@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
Naoya Horiguchi <naoya.horiguchi@....com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
Dave Hansen <dave.hansen@...ux.intel.com>
Subject: Re: [PATCH RFC 04/13] mm/x86: Change pXd_huge() behavior to exclude
swap entries
On Wed, Mar 06, 2024 at 06:41:38PM +0800, peterx@...hat.com wrote:
> From: Peter Xu <peterx@...hat.com>
>
> This patch partly reverts below commits:
>
> 3a194f3f8ad0 ("mm/hugetlb: make pud_huge() and follow_huge_pud() aware of non-present pud entry")
> cbef8478bee5 ("mm/hugetlb: pmd_huge() returns true for non-present hugepage")
>
> Right now, pXd_huge() definition across kernel is unclear. We have two
> groups that think differently on swap entries:
>
> - x86/sparc: Allow pXd_huge() to accept swap entries
> - all the rest: Doesn't allow pXd_huge() to accept swap entries
>
> This is so confusing. Since the sparc helpers seem to be added in 2016,
> which is after x86's (2015), so sparc could have followed a trend. x86
> proposed such swap handling in 2015 to resolve hugetlb swap entries hit in
> GUP, but now GUP guards swap entries with !pXd_present() in all layers so
> we should be safe.
>
> We should define this API properly, one way or another, rather than keep
> them defined differently across archs.
>
> Gut feeling tells me that pXd_huge() shouldn't include swap entries, and it
> turns out that I am not the only one thinking so, the question was raised
> when the current pmd_huge() for x86 was proposed by Ville Syrjälä:
>
> https://lore.kernel.org/all/Y2WQ7I4LXh8iUIRd@intel.com/
>
> I might also be missing something obvious, but why is it even necessary
> to treat PRESENT==0+PSE==0 as a huge entry?
>
> It is also questioned when Jason Gunthorpe reviewed the other patchset on
> swap entry handlings:
>
> https://lore.kernel.org/all/20240221125753.GQ13330@nvidia.com/
>
> Revert its meaning back to original. It shouldn't have any functional
> change as we should be ready with guards on !pXd_present() explicitly
> everywhere.
>
> Note that I also dropped the "#if CONFIG_PGTABLE_LEVELS > 2", it was there
> probably because it was breaking things when 3a194f3f8ad0 was proposed,
> according to the report here:
>
> https://lore.kernel.org/all/Y2LYXItKQyaJTv8j@intel.com/
>
> Now we shouldn't need that.
>
> Instead of reverting to _PAGE_PSE raw check, leverage pXd_leaf().
>
> Cc: Naoya Horiguchi <naoya.horiguchi@....com>
> Cc: Thomas Gleixner <tglx@...utronix.de>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Borislav Petkov <bp@...en8.de>
> Cc: Dave Hansen <dave.hansen@...ux.intel.com>
> Cc: x86@...nel.org
> Signed-off-by: Peter Xu <peterx@...hat.com>
> ---
> arch/x86/mm/hugetlbpage.c | 18 ++++--------------
> 1 file changed, 4 insertions(+), 14 deletions(-)
I think this is the right thing to do, callers should be more directly
sensitive to swap entries not back into it indirectly from a helper
like this.
Jason
Powered by blists - more mailing lists