[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b9dcdeb8-6994-55f9-1780-794e17589129@oracle.com>
Date: Wed, 6 Jan 2021 13:07:40 -0800
From: Mike Kravetz <mike.kravetz@...cle.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Muchun Song <songmuchun@...edance.com>, akpm@...ux-foundation.org,
n-horiguchi@...jp.nec.com, ak@...ux.intel.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 2/6] mm: hugetlbfs: fix cannot migrate the fallocated
HugeTLB page
On 1/6/21 12:02 PM, Michal Hocko wrote:
> On Wed 06-01-21 11:30:25, Mike Kravetz wrote:
>> On 1/6/21 8:35 AM, Michal Hocko wrote:
>>> On Wed 06-01-21 16:47:35, Muchun Song wrote:
>>>> Because we only can isolate a active page via isolate_huge_page()
>>>> and hugetlbfs_fallocate() forget to mark it as active, we cannot
>>>> isolate and migrate those pages.
>>>
>>> I've little bit hard time to understand this initially and had to dive
>>> into the code to make sense of it. I would consider the following
>>> wording easier to grasp. Feel free to reuse if you like.
>>> "
>>> If a new hugetlb page is allocated during fallocate it will not be
>>> marked as active (set_page_huge_active) which will result in a later
>>> isolate_huge_page failure when the page migration code would like to
>>> move that page. Such a failure would be unexpected and wrong.
>>> "
>>>
>>> Now to the fix. I believe that this patch shows that the
>>> set_page_huge_active is just too subtle. Is there any reason why we
>>> cannot make all freshly allocated huge pages active by default?
>>
>> I looked into that yesterday. The primary issue is in page fault code,
>> hugetlb_no_page is an example. If page_huge_active is set, then it can
>> be isolated for migration. So, migration could race with the page fault
>> and the page could be migrated before being added to the page table of
>> the faulting task. This was an issue when hugetlb_no_page set_page_huge_active
>> right after allocating and clearing the huge page. Commit cb6acd01e2e4
>> moved the set_page_huge_active after adding the page to the page table
>> to address this issue.
>
> Thanks for the clarification. I was not aware of this subtlety. The
> existing comment is not helping much TBH. I am still digesting the
> suggested race. The page is new and exclusive and not visible via page
> tables yet, so the only source of the migration would be pfn based
> (hotplug, poisoning), right?
That is correct.
> Btw. s@..._page_huge_active@..._page_huge_migrateable@ would help
> readability IMHO. With a comment explaining that this _has_ to be called
> after the page is fully initialized.
Agree, I will add that as a future enhancement.
--
Mike Kravetz
Powered by blists - more mailing lists