Message-ID: <CAMZfGtVnS=_m4fpGBfDpOpdgzP02QCteUQn-gGiLADWfGiVJ=A@mail.gmail.com>
Date: Mon, 21 Dec 2020 19:25:15 +0800
From: Muchun Song <songmuchun@...edance.com>
To: Oscar Salvador <osalvador@...e.de>
Cc: Jonathan Corbet <corbet@....net>,
Mike Kravetz <mike.kravetz@...cle.com>,
Thomas Gleixner <tglx@...utronix.de>, mingo@...hat.com,
bp@...en8.de, x86@...nel.org, hpa@...or.com,
dave.hansen@...ux.intel.com, luto@...nel.org,
Peter Zijlstra <peterz@...radead.org>, viro@...iv.linux.org.uk,
Andrew Morton <akpm@...ux-foundation.org>, paulmck@...nel.org,
mchehab+huawei@...nel.org, pawan.kumar.gupta@...ux.intel.com,
Randy Dunlap <rdunlap@...radead.org>, oneukum@...e.com,
anshuman.khandual@....com, jroedel@...e.de,
Mina Almasry <almasrymina@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
Michal Hocko <mhocko@...e.com>,
"Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>,
David Hildenbrand <david@...hat.com>, naoya.horiguchi@....com,
Xiongchun duan <duanxiongchun@...edance.com>,
linux-doc@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [External] Re: [PATCH v10 03/11] mm/hugetlb: Free the vmemmap
pages associated with each HugeTLB page
On Mon, Dec 21, 2020 at 5:11 PM Oscar Salvador <osalvador@...e.de> wrote:
>
> On Thu, Dec 17, 2020 at 08:12:55PM +0800, Muchun Song wrote:
> > +static inline void free_bootmem_page(struct page *page)
> > +{
> > +	unsigned long magic = (unsigned long)page->freelist;
> > +
> > +	/*
> > +	 * The reserve_bootmem_region sets the reserved flag on bootmem
> > +	 * pages.
> > +	 */
> > +	VM_WARN_ON(page_ref_count(page) != 2);
> > +
> > +	if (magic == SECTION_INFO || magic == MIX_SECTION_INFO)
> > +		put_page_bootmem(page);
> > +	else
> > +		VM_WARN_ON(1);
>
> Ideally, I think we want to see how the page looks, since its state
> is not what we expected, so maybe join both conditions and use dump_page().
Agree. Will do. Thanks.
>
> > + * By removing redundant page structs for HugeTLB pages, memory can returned to
> ^^ be
Thanks.
> > + * the buddy allocator for other uses.
>
> [...]
>
> > +void free_huge_page_vmemmap(struct hstate *h, struct page *head)
> > +{
> > +	unsigned long vmemmap_addr = (unsigned long)head;
> > +
> > +	if (!free_vmemmap_pages_per_hpage(h))
> > +		return;
> > +
> > +	vmemmap_remap_free(vmemmap_addr + RESERVE_VMEMMAP_SIZE,
> > +			   free_vmemmap_pages_size_per_hpage(h));
>
> I am not sure what others think, but I would like to see vmemmap_remap_free taking
> three arguments: start, end, and reuse addr, e.g:
>
> void free_huge_page_vmemmap(struct hstate *h, struct page *head)
> {
> 	unsigned long vmemmap_addr = (unsigned long)head;
> 	unsigned long vmemmap_end, vmemmap_reuse;
>
> 	if (!free_vmemmap_pages_per_hpage(h))
> 		return;
>
> 	vmemmap_addr += RESERVE_VMEMMAP_SIZE;
> 	vmemmap_end = vmemmap_addr + free_vmemmap_pages_size_per_hpage(h);
> 	vmemmap_reuse = vmemmap_addr - PAGE_SIZE;
>
> 	vmemmap_remap_free(vmemmap_addr, vmemmap_end, vmemmap_reuse);
> }
>
> The reason for me to do this is to let the callers of vmemmap_remap_free decide
> __what__ they want to remap.
>
> More on this below.
>
>
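To make the three addresses concrete, here is a minimal userspace sketch of the arithmetic in the suggested caller. The constants are assumptions for illustration only (4 KiB base pages, a 64-byte struct page, two vmemmap pages reserved per HugeTLB page), and hpage_vmemmap_range() and struct vmemmap_range are hypothetical names, not part of the patch.

```c
#include <assert.h>

/* Assumed values, for illustration only. */
#define PAGE_SIZE            4096UL
#define STRUCT_PAGE_SIZE     64UL	/* assumption: sizeof(struct page) */
#define RESERVE_VMEMMAP_NR   2UL	/* assumption: vmemmap pages kept per huge page */
#define RESERVE_VMEMMAP_SIZE (RESERVE_VMEMMAP_NR * PAGE_SIZE)

/* Hypothetical helper type: the three values the caller would pass down. */
struct vmemmap_range {
	unsigned long start;	/* first vmemmap address to remap */
	unsigned long end;	/* one past the last address to remap */
	unsigned long reuse;	/* address whose backing page is reused */
};

/* Number of vmemmap pages holding the struct pages of one huge page. */
static unsigned long vmemmap_pages_per_hpage(unsigned long hpage_size)
{
	return (hpage_size / PAGE_SIZE) * STRUCT_PAGE_SIZE / PAGE_SIZE;
}

/* Mirror of the start/end/reuse computation in the suggested caller. */
static struct vmemmap_range hpage_vmemmap_range(unsigned long head_vaddr,
						unsigned long hpage_size)
{
	unsigned long total = vmemmap_pages_per_hpage(hpage_size);
	unsigned long nr_free = total > RESERVE_VMEMMAP_NR ?
				total - RESERVE_VMEMMAP_NR : 0;
	struct vmemmap_range r;

	assert(head_vaddr % PAGE_SIZE == 0);

	r.start = head_vaddr + RESERVE_VMEMMAP_SIZE;
	r.end = r.start + nr_free * PAGE_SIZE;
	r.reuse = r.start - PAGE_SIZE;
	return r;
}
```

Under these assumptions a 2 MiB huge page has 8 vmemmap pages, of which 6 fall in [start, end), and reuse points at the last reserved page.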
> > +static void vmemmap_pte_range(pmd_t *pmd, unsigned long addr,
> > +			      unsigned long end,
> > +			      struct vmemmap_remap_walk *walk)
> > +{
> > +	pte_t *pte;
> > +
> > +	pte = pte_offset_kernel(pmd, addr);
> > +
> > +	if (walk->reuse_addr == addr) {
> > +		BUG_ON(pte_none(*pte));
> > +		walk->reuse_page = pte_page(*pte++);
> > +		addr += PAGE_SIZE;
> > +	}
>
> Although it is quite obvious, a brief comment here pointing out what we are
> doing and that this is meant to be set only once would be nice.
OK. Will do.
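
As a rough illustration of the "set only once" behaviour, here is a userspace model of the walk. It is only a sketch: page table entries are modelled as an array of frame numbers, and struct remap_walk and remap_pte_range() are hypothetical stand-ins for the kernel structures, not the patch code.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL

/* Hypothetical userspace stand-in for struct vmemmap_remap_walk. */
struct remap_walk {
	unsigned long reuse_addr;	/* address whose frame is reused */
	int reuse_page;			/* frame backing reuse_addr, -1 until found */
	int freed[16];			/* frames released by the remap */
	size_t nr_freed;
};

/*
 * Walk the entries covering [addr, end). pte[i] holds the frame number
 * mapped at addr + i * PAGE_SIZE.
 */
static void remap_pte_range(int *pte, unsigned long addr, unsigned long end,
			    struct remap_walk *walk)
{
	/*
	 * Pick up the reuse page. Only the first entry of the whole walk
	 * can match reuse_addr, so this branch runs at most once.
	 */
	if (walk->reuse_addr == addr) {
		assert(*pte >= 0);	/* models BUG_ON(pte_none(*pte)) */
		walk->reuse_page = *pte++;
		addr += PAGE_SIZE;
	}

	for (; addr < end; addr += PAGE_SIZE, pte++) {
		assert(walk->reuse_page >= 0);
		assert(walk->nr_freed < 16);
		walk->freed[walk->nr_freed++] = *pte;	/* old frame can be freed */
		*pte = walk->reuse_page;		/* remap to the shared frame */
	}
}
```

After the walk, every entry in the range maps the same frame and the frames that were replaced sit in freed[], which plays the role of the vmemmap_pages list.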
>
>
> > +static void vmemmap_remap_range(unsigned long start, unsigned long end,
> > +				struct vmemmap_remap_walk *walk)
> > +{
> > +	unsigned long addr = start - PAGE_SIZE;
> > +	unsigned long next;
> > +	pgd_t *pgd;
> > +
> > +	VM_BUG_ON(!IS_ALIGNED(start, PAGE_SIZE));
> > +	VM_BUG_ON(!IS_ALIGNED(end, PAGE_SIZE));
> > +
> > +	walk->reuse_page = NULL;
> > +	walk->reuse_addr = addr;
>
> With the change I suggested above, struct vmemmap_remap_walk should be
> initialized at once in vmemmap_remap_free, so this should no longer be needed.
You are right.
> (And btw, you do not need to set reuse_page to NULL, the way you init the struct
> in vmemmap_remap_free makes sure to null any field you do not explicitly set).
>
>
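The point about the initializer can be checked in plain C: members that a designated initializer does not name are initialized as if they had static storage duration, so pointers become NULL. A small sketch follows, with a simplified and hypothetical struct layout (the callback signature in particular is not the one from the patch).

```c
#include <assert.h>
#include <stddef.h>

struct page;	/* opaque here, only used through a pointer */

/* Simplified, hypothetical layout using the field names from the patch. */
struct vmemmap_remap_walk {
	void (*remap_pte)(void);	/* simplified callback signature */
	struct page *reuse_page;
	unsigned long reuse_addr;
	void *vmemmap_pages;
};

static void noop_remap(void)
{
}

static struct vmemmap_remap_walk make_walk(unsigned long reuse, void *pages)
{
	struct vmemmap_remap_walk walk = {
		.reuse_addr	= reuse,
		.remap_pte	= noop_remap,
		.vmemmap_pages	= pages,
	};

	/* reuse_page was not named above, so it is already NULL. */
	assert(walk.reuse_page == NULL);
	return walk;
}
```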
> > +static void vmemmap_remap_pte(pte_t *pte, unsigned long addr,
> > +			      struct vmemmap_remap_walk *walk)
> > +{
> > +	/*
> > +	 * Make the tail pages are mapped with read-only to catch
> > +	 * illegal write operation to the tail pages.
> "Remap the tail pages as read-only to ..."
Thanks.
>
> > +	 */
> > +	pgprot_t pgprot = PAGE_KERNEL_RO;
> > +	pte_t entry = mk_pte(walk->reuse_page, pgprot);
> > +	struct page *page;
> > +
> > +	page = pte_page(*pte);
>
> struct page *page = pte_page(*pte);
>
> since you did the same for the other two.
Yeah. Will change to this.
>
> > +	list_add(&page->lru, walk->vmemmap_pages);
> > +
> > +	set_pte_at(&init_mm, addr, pte, entry);
> > +}
> > +
> > +/**
> > + * vmemmap_remap_free - remap the vmemmap virtual address range
> > + * [start, start + size) to the page which
> > + * [start - PAGE_SIZE, start) is mapped,
> > + * then free vmemmap pages.
> > + * @start: start address of the vmemmap virtual address range
> > + * @size: size of the vmemmap virtual address range
> > + */
> > +void vmemmap_remap_free(unsigned long start, unsigned long size)
> > +{
> > +	unsigned long end = start + size;
> > +	LIST_HEAD(vmemmap_pages);
> > +
> > +	struct vmemmap_remap_walk walk = {
> > +		.remap_pte = vmemmap_remap_pte,
> > +		.vmemmap_pages = &vmemmap_pages,
> > +	};
>
> As stated above, this would become:
>
> void vmemmap_remap_free(unsigned long start, unsigned long end,
> 			unsigned long reuse)
> {
> 	LIST_HEAD(vmemmap_pages);
> 	struct vmemmap_remap_walk walk = {
> 		.reuse_addr = reuse,
> 		.remap_pte = vmemmap_remap_pte,
> 		.vmemmap_pages = &vmemmap_pages,
> 	};
>
> You might have had your reasons to do this way, but this looks more natural
> to me, with the plus that callers of vmemmap_remap_free can specify
> what they want to remap.
Should we add a BUG_ON in vmemmap_remap_free() for now?
BUG_ON(reuse != start - PAGE_SIZE);
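
For illustration, the check could be modelled in userspace as below. This is a sketch under the assumption that the reuse page must be the one immediately preceding start, as in the suggested caller where vmemmap_reuse = vmemmap_addr - PAGE_SIZE. The helper names are hypothetical, and assert() stands in for BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>

#define PAGE_SIZE 4096UL	/* assumed base page size */

/* Hypothetical helper: true if the three arguments are consistent. */
static bool vmemmap_remap_args_ok(unsigned long start, unsigned long end,
				  unsigned long reuse)
{
	/* All three addresses must be page aligned. */
	if (start % PAGE_SIZE || end % PAGE_SIZE || reuse % PAGE_SIZE)
		return false;

	/* The range to remap must not be empty. */
	if (start >= end)
		return false;

	/* For now, reuse must be the page immediately before start. */
	return start > reuse && start - reuse == PAGE_SIZE;
}

/* Userspace analogue of placing the BUG_ON at function entry. */
static void vmemmap_remap_check(unsigned long start, unsigned long end,
				unsigned long reuse)
{
	assert(vmemmap_remap_args_ok(start, end, reuse));
}
```

If callers later need a different reuse address, only the last condition would have to be relaxed.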
>
>
> --
> Oscar Salvador
> SUSE L3
--
Yours,
Muchun