Message-ID: <73832edd-13ec-8032-d8d6-4afc53297fdb@redhat.com>
Date: Wed, 9 Dec 2020 10:32:55 +0100
From: David Hildenbrand <david@...hat.com>
To: Muchun Song <songmuchun@...edance.com>
Cc: Jonathan Corbet <corbet@....net>,
Mike Kravetz <mike.kravetz@...cle.com>,
Thomas Gleixner <tglx@...utronix.de>, mingo@...hat.com,
bp@...en8.de, x86@...nel.org, hpa@...or.com,
dave.hansen@...ux.intel.com, luto@...nel.org,
Peter Zijlstra <peterz@...radead.org>, viro@...iv.linux.org.uk,
Andrew Morton <akpm@...ux-foundation.org>, paulmck@...nel.org,
mchehab+huawei@...nel.org, pawan.kumar.gupta@...ux.intel.com,
Randy Dunlap <rdunlap@...radead.org>, oneukum@...e.com,
anshuman.khandual@....com, jroedel@...e.de,
Mina Almasry <almasrymina@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
Oscar Salvador <osalvador@...e.de>,
Michal Hocko <mhocko@...e.com>,
"Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>,
Xiongchun duan <duanxiongchun@...edance.com>,
linux-doc@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [External] Re: [PATCH v7 05/15] mm/bootmem_info: Introduce
{free,prepare}_vmemmap_page()
On 09.12.20 10:25, Muchun Song wrote:
> On Wed, Dec 9, 2020 at 4:50 PM David Hildenbrand <david@...hat.com> wrote:
>>
>> On 09.12.20 08:36, Muchun Song wrote:
>>> On Mon, Dec 7, 2020 at 8:39 PM David Hildenbrand <david@...hat.com> wrote:
>>>>
>>>> On 30.11.20 16:18, Muchun Song wrote:
>>>>> In a later patch, we can use free_vmemmap_page() to free the
>>>>> unused vmemmap pages and prepare_vmemmap_page() to initialize a
>>>>> page that will be used as a vmemmap page.
>>>>>
>>>>> Signed-off-by: Muchun Song <songmuchun@...edance.com>
>>>>> ---
>>>>> include/linux/bootmem_info.h | 24 ++++++++++++++++++++++++
>>>>> 1 file changed, 24 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/bootmem_info.h b/include/linux/bootmem_info.h
>>>>> index 4ed6dee1adc9..239e3cc8f86c 100644
>>>>> --- a/include/linux/bootmem_info.h
>>>>> +++ b/include/linux/bootmem_info.h
>>>>> @@ -3,6 +3,7 @@
>>>>> #define __LINUX_BOOTMEM_INFO_H
>>>>>
>>>>> #include <linux/mmzone.h>
>>>>> +#include <linux/mm.h>
>>>>>
>>>>> /*
>>>>> * Types for free bootmem stored in page->lru.next. These have to be in
>>>>> @@ -22,6 +23,29 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat);
>>>>> void get_page_bootmem(unsigned long info, struct page *page,
>>>>> unsigned long type);
>>>>> void put_page_bootmem(struct page *page);
>>>>> +
>>>>> +static inline void free_vmemmap_page(struct page *page)
>>>>> +{
>>>>> +	VM_WARN_ON(!PageReserved(page) || page_ref_count(page) != 2);
>>>>> +
>>>>> +	/* bootmem page has reserved flag in the reserve_bootmem_region */
>>>>> +	if (PageReserved(page)) {
>>>>> +		unsigned long magic = (unsigned long)page->freelist;
>>>>> +
>>>>> +		if (magic == SECTION_INFO || magic == MIX_SECTION_INFO)
>>>>> +			put_page_bootmem(page);
>>>>> +		else
>>>>> +			WARN_ON(1);
>>>>> +	}
>>>>> +}
>>>>> +
>>>>> +static inline void prepare_vmemmap_page(struct page *page)
>>>>> +{
>>>>> +	unsigned long section_nr = pfn_to_section_nr(page_to_pfn(page));
>>>>> +
>>>>> +	get_page_bootmem(section_nr, page, SECTION_INFO);
>>>>> +	mark_page_reserved(page);
>>>>> +}
>>>>
>>>> Can you clarify in the description when exactly these functions are
>>>> called and on which type of pages?
>>>>
>>>> Would indicating "bootmem" in the function names make it clearer what we
>>>> are dealing with?
>>>>
>>>> E.g., any memory allocated via the memblock allocator and not via the
>>>> buddy will be marked reserved already in the memmap. It's unclear to me
>>>> why we need the mark_page_reserved() here - can you enlighten me? :)
>>>
>>> Sorry for ignoring this question. The vmemmap pages are allocated
>>> from the bootmem allocator, so they are marked PG_reserved. Such bootmem
>>> pages should be freed via put_page_bootmem(), which clears PG_reserved.
>>> To be consistent, prepare_vmemmap_page() also marks the page as
>>> PG_reserved.
>>
>> I don't think that really makes sense.
>>
>> After put_page_bootmem() put the last reference, it clears PG_reserved
>> and hands the page over to the buddy via free_reserved_page(). From that
>> point on, further get_page_bootmem() would be completely wrong and
>> dangerous.
>>
>> Both put_page_bootmem() and get_page_bootmem() rely on the fact that
>> they are dealing with memblock allocations - marked via PG_reserved. If
>> prepare_vmemmap_page() would be called on something that's *not* coming
>> from the memblock allocator, it would be completely broken - or am I
>> missing something?
>>
>> AFAICT, there should rather be a BUG_ON(!PageReserved(page)) in
>> prepare_vmemmap_page() - or proper handling to deal with !memblock
>> allocations.
>>
>
> I want to allocate some pages as the vmemmap when
> we free a HugeTLB page to the buddy allocator. So I use
> prepare_vmemmap_page() to initialize the page (which is
> allocated from the buddy allocator) and make it the vmemmap
> of the freed HugeTLB page.
>
> Any suggestions to deal with this case?
If you obtained pages via the buddy, there shouldn't be anything special
to handle, no? What speaks against

prepare_vmemmap_page():
	if (!PageReserved(page))
		return;

put_page_bootmem():
	if (!PageReserved(page))
		__free_page(page);

Or if we care about multiple references, get_page() and put_page().
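
To make the suggestion above concrete, here is a minimal userspace sketch, not kernel code. The struct page below and the model_*() helpers are simplified stand-ins invented for illustration. The sketch assumes a bootmem vmemmap page starts with a reference count of 1 and sits at 2 after get_page_bootmem(), matching the count the quoted VM_WARN_ON() expects. It only shows how a PageReserved()-style check inside both helpers lets the same calls handle memblock and buddy pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct page (assumption, not the real layout). */
struct page {
	bool reserved;	/* models PG_reserved */
	int refcount;	/* models the page reference count */
	bool in_buddy;	/* set once the page is handed back to the buddy */
};

/* Models prepare_vmemmap_page() with the suggested early return. */
static void model_prepare_vmemmap_page(struct page *page)
{
	if (!page->reserved)
		return;		/* buddy page: nothing special to set up */
	page->refcount++;	/* stands in for get_page_bootmem() */
}

/* Models the free path with the suggested !PageReserved() fallback. */
static void model_free_vmemmap_page(struct page *page)
{
	if (!page->reserved) {
		page->refcount = 0;	/* stands in for __free_page() */
		page->in_buddy = true;
		return;
	}
	assert(page->refcount >= 2);
	/* Modeled on put_page_bootmem(): dropping to 1 releases the page. */
	if (--page->refcount == 1) {
		page->reserved = false;
		page->in_buddy = true;
	}
}
```

With this shape the caller never needs to know where a vmemmap page came from, which is why "bootmem" could be dropped from the helper names.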
>
> I have a solution to address this. When pages are allocated
> from the buddy as vmemmap pages, we do not call
> prepare_vmemmap_page().
>
> When we free some vmemmap pages of a HugeTLB
> page, if PG_reserved is set on the vmemmap page,
> we call free_vmemmap_page() to free it to the buddy;
> otherwise we call free_page(). What is your opinion?
That would also work. Then, please include "bootmem" as part of the
function name. If you plan on using my suggestion, you can drop
"bootmem" from the name as it works for both types of pages.
--
Thanks,
David / dhildenb
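
For comparison, the alternative discussed above (the caller checks PG_reserved and picks the free routine) can be sketched the same way. This is again a userspace model under stated assumptions: struct page is a simplified stand-in, and model_free_bootmem_vmemmap_page() is a hypothetical name that merely follows the request to put "bootmem" in the name, since in this variant the helper is only valid for reserved pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the kernel's struct page (assumption, not the real layout). */
struct page {
	bool reserved;	/* models PG_reserved */
	int refcount;	/* models the page reference count */
	bool in_buddy;	/* set once the page is handed back to the buddy */
};

/* Hypothetical bootmem-only helper: must never see a buddy page. */
static void model_free_bootmem_vmemmap_page(struct page *page)
{
	assert(page->reserved);
	/* Modeled on put_page_bootmem(): dropping to 1 releases the page. */
	if (--page->refcount == 1) {
		page->reserved = false;
		page->in_buddy = true;
	}
}

/* Stands in for free_page() on a page that came from the buddy. */
static void model_free_buddy_page(struct page *page)
{
	assert(!page->reserved);
	page->refcount = 0;
	page->in_buddy = true;
}

/* Caller-side dispatch: the PG_reserved check lives outside the helpers. */
static void model_free_one_vmemmap_page(struct page *page)
{
	if (page->reserved)
		model_free_bootmem_vmemmap_page(page);
	else
		model_free_buddy_page(page);
}
```

The trade-off is that every caller has to repeat the PG_reserved check, whereas the helper itself stays strictly bootmem-specific.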