[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d3e4cc32-ce07-4ce2-789a-3c1df093c270@oracle.com>
Date: Wed, 28 Oct 2020 16:42:13 -0700
From: Mike Kravetz <mike.kravetz@...cle.com>
To: Muchun Song <songmuchun@...edance.com>
Cc: Jonathan Corbet <corbet@....net>,
Thomas Gleixner <tglx@...utronix.de>, mingo@...hat.com,
bp@...en8.de, x86@...nel.org, hpa@...or.com,
dave.hansen@...ux.intel.com, luto@...nel.org,
Peter Zijlstra <peterz@...radead.org>, viro@...iv.linux.org.uk,
Andrew Morton <akpm@...ux-foundation.org>, paulmck@...nel.org,
mchehab+huawei@...nel.org, pawan.kumar.gupta@...ux.intel.com,
Randy Dunlap <rdunlap@...radead.org>, oneukum@...e.com,
anshuman.khandual@....com, jroedel@...e.de,
Mina Almasry <almasrymina@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
Xiongchun duan <duanxiongchun@...edance.com>,
linux-doc@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [External] Re: [PATCH v2 05/19] mm/hugetlb: Introduce pgtable
allocation/freeing helpers
On 10/28/20 12:26 AM, Muchun Song wrote:
> On Wed, Oct 28, 2020 at 8:33 AM Mike Kravetz <mike.kravetz@...cle.com> wrote:
>> On 10/26/20 7:51 AM, Muchun Song wrote:
>>
>> I see the following routines follow the pattern for vmemmap manipulation
>> in dax.
>
> Did you mean move those functions to mm/sparse-vmemmap.c?
No. Sorry, that was mostly a not to myself.
>>> +static void vmemmap_pgtable_deposit(struct page *page, pte_t *pte_p)
>>> +{
>>> + pgtable_t pgtable = virt_to_page(pte_p);
>>> +
>>> + /* FIFO */
>>> + if (!page_huge_pte(page))
>>> + INIT_LIST_HEAD(&pgtable->lru);
>>> + else
>>> + list_add(&pgtable->lru, &page_huge_pte(page)->lru);
>>> + page_huge_pte(page) = pgtable;
>>> +}
>>> +
>>> +static pte_t *vmemmap_pgtable_withdraw(struct page *page)
>>> +{
>>> + pgtable_t pgtable;
>>> +
>>> + /* FIFO */
>>> + pgtable = page_huge_pte(page);
>>> + if (unlikely(!pgtable))
>>> + return NULL;
>>> + page_huge_pte(page) = list_first_entry_or_null(&pgtable->lru,
>>> + struct page, lru);
>>> + if (page_huge_pte(page))
>>> + list_del(&pgtable->lru);
>>> + return page_to_virt(pgtable);
>>> +}
>>> +
...
>>> @@ -1783,6 +1892,14 @@ static struct page *alloc_fresh_huge_page(struct hstate *h,
>>> if (!page)
>>> return NULL;
>>>
>>> + if (vmemmap_pgtable_prealloc(h, page)) {
>>> + if (hstate_is_gigantic(h))
>>> + free_gigantic_page(page, huge_page_order(h));
>>> + else
>>> + put_page(page);
>>> + return NULL;
>>> + }
>>> +
>>
>> It seems a bit strange that we will fail a huge page allocation if
>> vmemmap_pgtable_prealloc fails. Not sure, but it almost seems like we shold
>> allow the allocation and log a warning? It is somewhat unfortunate that
>> we need to allocate a page to free pages.
>
> Yeah, it seems unfortunate. But if we allocate success, we can free some
> vmemmap pages later. Like a compromise :) . If we can successfully allocate
> a huge page, I also prefer to be able to successfully allocate another one page.
> If we allow the allocation when vmemmap_pgtable_prealloc fails, we also
> need to mark this page that vmemmap has not been released. Seems
> increase complexity.
Yes, I think it is better to leave code as it is and avoid complexity.
--
Mike Kravetz
Powered by blists - more mailing lists