[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <44efc25e-525b-9e51-60e4-da20deb25ded@oracle.com>
Date: Thu, 19 Nov 2020 15:21:32 -0800
From: Mike Kravetz <mike.kravetz@...cle.com>
To: Muchun Song <songmuchun@...edance.com>,
Oscar Salvador <osalvador@...e.de>
Cc: Jonathan Corbet <corbet@....net>,
Thomas Gleixner <tglx@...utronix.de>, mingo@...hat.com,
bp@...en8.de, x86@...nel.org, hpa@...or.com,
dave.hansen@...ux.intel.com, luto@...nel.org,
Peter Zijlstra <peterz@...radead.org>, viro@...iv.linux.org.uk,
Andrew Morton <akpm@...ux-foundation.org>, paulmck@...nel.org,
mchehab+huawei@...nel.org, pawan.kumar.gupta@...ux.intel.com,
Randy Dunlap <rdunlap@...radead.org>, oneukum@...e.com,
anshuman.khandual@....com, jroedel@...e.de,
Mina Almasry <almasrymina@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
Michal Hocko <mhocko@...e.com>,
Xiongchun duan <duanxiongchun@...edance.com>,
linux-doc@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [External] Re: [PATCH v4 05/21] mm/hugetlb: Introduce pgtable
allocation/freeing helpers
On 11/18/20 10:17 PM, Muchun Song wrote:
> On Tue, Nov 17, 2020 at 11:06 PM Oscar Salvador <osalvador@...e.de> wrote:
>>
>> On Fri, Nov 13, 2020 at 06:59:36PM +0800, Muchun Song wrote:
>>> +#define page_huge_pte(page) ((page)->pmd_huge_pte)
>>
>> Seems you do not need this one anymore.
>>
>>> +void vmemmap_pgtable_free(struct page *page)
>>> +{
>>> + struct page *pte_page, *t_page;
>>> +
>>> + list_for_each_entry_safe(pte_page, t_page, &page->lru, lru) {
>>> + list_del(&pte_page->lru);
>>> + pte_free_kernel(&init_mm, page_to_virt(pte_page));
>>> + }
>>> +}
>>> +
>>> +int vmemmap_pgtable_prealloc(struct hstate *h, struct page *page)
>>> +{
>>> + unsigned int nr = pgtable_pages_to_prealloc_per_hpage(h);
>>> +
>>> + /* Store preallocated pages on huge page lru list */
>>> + INIT_LIST_HEAD(&page->lru);
>>> +
>>> + while (nr--) {
>>> + pte_t *pte_p;
>>> +
>>> + pte_p = pte_alloc_one_kernel(&init_mm);
>>> + if (!pte_p)
>>> + goto out;
>>> + list_add(&virt_to_page(pte_p)->lru, &page->lru);
>>> + }
>>
>> Definetely this looks better and easier to handle.
>> Btw, did you explore Matthew's hint about instead of allocating a new page,
>> using one of the ones you are going to free to store the ptes?
>> I am not sure whether it is feasible at all though.
>
> Hi Oscar and Matthew,
>
> I have started an investigation about this. Finally, I think that it
> may not be feasible. If we use a vmemmap page frame as a
> page table when we split the PMD table firstly, in this stage,
> we need to set 512 pte entry to the vmemmap page frame. If
> someone reads the tail struct page struct of the HugeTLB,
> it can get the arbitrary value (I am not sure it actually exists,
> maybe the memory compaction module can do this). So on
> the safe side, I think that allocating a new page is a good
> choice.
Thanks for looking into this.
If I understand correctly, the issue is that you need the pte page to set
up the new mappings. In your current code, this is done before removing
the pages of struct pages. This keeps everything 'consistent' as things
are remapped.
If you want to use one of the 'pages of struct pages' for the new pte
page, then there will be a period of time when things are inconsistent.
Before setting up the mapping, some code could potentially access that
pages of struct pages.
I tend to agree that allocating allocating a new page is the safest thing
to do here. Or, perhaps someone can think of a way make this safe.
--
Mike Kravetz
Powered by blists - more mailing lists