[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d0abf3db-0726-81b9-5a44-36cb1736a155@nvidia.com>
Date: Wed, 18 May 2022 00:12:18 -0700
From: John Hubbard <jhubbard@...dia.com>
To: Mike Kravetz <mike.kravetz@...cle.com>,
Minchan Kim <minchan@...nel.org>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
syzbot <syzbot+acf65ca584991f3cc447@...kaller.appspotmail.com>,
linux-kernel@...r.kernel.org, linux-mm@...ck.org,
llvm@...ts.linux.dev, nathan@...nel.org, ndesaulniers@...gle.com,
syzkaller-bugs@...glegroups.com, trix@...hat.com,
Matthew Wilcox <willy@...radead.org>,
Stephen Rothwell <sfr@...b.auug.org.au>,
David Hildenbrand <david@...hat.com>
Subject: Re: [syzbot] WARNING in follow_hugetlb_page
On 5/16/22 20:37, Mike Kravetz wrote:
> Later, that code was modified (for performance reasons) in commit 0fa5bc4023c18 to do a single try_grab_compound_page() instead of multiple
> calls to try_grab_page(). At the time it was not noticed that
> try_grab_compound_page had a check for 'pinnable' when try_grab_page did
> not. So, I think this commit actually changed the behavior.
This makes me unhappy with the try_grab_page() situation: the name
doesn't give you any hint that it now has extra policy in there.
Furthermore, the policy is unaware of all of the call sites and
is already getting it wrong.
After looking through the call sites, I'm leaning slightly toward pulling
is_pinnable_page() out, and letting the call sites decide if they want
or need to apply that policy.
Just because try_grab_folio() is a choke point, does not mean that it
is the right place for the is_pinnable_page() checks. And while it's
nice to avoid scattering is_pinnable_page() all over the place, it's
more important to *not* have it when it is not correct.
There is a counter-argument, though, which goes something like,
"try_grab_folio(FOLL_PIN) is never ever correct if the page is not
pinnable". But that's not as clear cut as one might think. See below.
>
>>
>> try_grab_folio()
>> /*
>> * Can't do FOLL_LONGTERM + FOLL_PIN gup fast path if not in a
>> * right zone, so fail and let the caller fall back to the slow
>> * path.
>> */
>> if (unlikely((flags & FOLL_LONGTERM) &&
>> !is_pinnable_page(page))) /* which we just changed */
>> return NULL;
>>
>> ...and now I'm starting to think that this warning might fire even with
>> the corrected check for MIGRATE_CMA || MIGRATE_ISOLATE. Because
>> try_grab_folio() didn't always have this early exit and it is starting
>> to look wrong.
>>
>> Simply attempting to pin a non-pinnable huge page would hit this
>> warning. Adding additional reasons that a page is not pinnable (which
>> the patch does) could make this more likely to fire.
>
> Yes, that is correct. One could easily allocate a hugetlb page from
> CMA and trigger this warning.
>
Thanks for confirming that!
>>
>> I need to look at this a little more closely, it is making me wonder
>> whether the is_pinnable_page() check is a problem in this path. The
>> comment in try_grab_folio() indicates that the early return is a hack
>> (it assumes that the caller is in the gup fast path), and maybe the hack
>> is just wrong here--I think we're actually on the slow gup path. Not
>> good.
>>
>> Mike, any thoughts here?
>>
>
> Do you know why try_grab_compound_page(now try_grab_folio) checks for
> pinnable when try_grab_page does not?
I think it's just an oversight. The CMA and migrate-out logic was sort of
retrofitted and I think it's just incomplete, yet.
>
> Then I guess the next question is 'Should we allow pinning of hugetlb pages
> in these areas?'. My first thought would be no. But, recall it was 'allowed'
> until that commit which changed try_grab_page to try_grab_compound_page.
> In the 'common' case of compaction, we do not attempt to migrate/move hugetlb
> pages (last time I looked), so long term pinning should not be an issue.
> However, for things like memory offline or alloc_contig_range() we want to
> migrate hugetlb pages, so this would be an issue there.
>
> At a minimum, I think the warning should go.
Agreed. That, and either a) bring try_grab_folio() and try_grab_page() into
the same behavior with respect to checking for pinnable, or b) lifting
is_pinnable_page() out of try_grab_folio() and letting the callers decide, as
mentioned above.
Your point above makes it seem like the flexibility of (b) might be better...
thanks,
--
John Hubbard
NVIDIA
Powered by blists - more mailing lists