lists.openwall.net - Open Source and information security mailing list archives
Date: Thu, 14 Jan 2021 18:57:16 +0800
From: Muchun Song <songmuchun@...edance.com>
To: Oscar Salvador <osalvador@...e.de>
Cc: Jonathan Corbet <corbet@....net>, Mike Kravetz <mike.kravetz@...cle.com>,
	Thomas Gleixner <tglx@...utronix.de>, mingo@...hat.com, bp@...en8.de,
	x86@...nel.org, hpa@...or.com, dave.hansen@...ux.intel.com,
	luto@...nel.org, Peter Zijlstra <peterz@...radead.org>,
	viro@...iv.linux.org.uk, Andrew Morton <akpm@...ux-foundation.org>,
	paulmck@...nel.org, mchehab+huawei@...nel.org,
	pawan.kumar.gupta@...ux.intel.com, Randy Dunlap <rdunlap@...radead.org>,
	oneukum@...e.com, anshuman.khandual@....com, jroedel@...e.de,
	Mina Almasry <almasrymina@...gle.com>, David Rientjes <rientjes@...gle.com>,
	Matthew Wilcox <willy@...radead.org>, Michal Hocko <mhocko@...e.com>,
	"Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>,
	David Hildenbrand <david@...hat.com>,
	HORIGUCHI NAOYA(堀口 直也) <naoya.horiguchi@....com>,
	Xiongchun duan <duanxiongchun@...edance.com>, linux-doc@...r.kernel.org,
	LKML <linux-kernel@...r.kernel.org>,
	Linux Memory Management List <linux-mm@...ck.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: [External] Re: [PATCH v12 04/13] mm/hugetlb: Free the vmemmap pages associated with each HugeTLB page

On Tue, Jan 12, 2021 at 4:05 PM Oscar Salvador <osalvador@...e.de> wrote:
>
> On Wed, Jan 06, 2021 at 10:19:22PM +0800, Muchun Song wrote:
> > Every HugeTLB page has more than one struct page structure. We __know__
> > that we only use the first 4 (HUGETLB_CGROUP_MIN_ORDER) struct page
> > structures to store metadata associated with each HugeTLB page.
> >
> > Many struct page structures are associated with each HugeTLB page.
> > For the tail pages, the value of compound_head is the same, so we can
> > reuse the first tail page's struct page. We remap the virtual
> > addresses of the remaining tail pages' struct pages onto the first
> > tail page's, and then free those page frames.
> > Therefore, we need to reserve two pages as vmemmap areas.
> >
> > When we allocate a HugeTLB page from the buddy allocator, we can free
> > some vmemmap pages associated with each HugeTLB page. It is more
> > appropriate to do this in prep_new_huge_page().
> >
> > The free_vmemmap_pages_per_hpage(), which indicates how many vmemmap
> > pages associated with a HugeTLB page can be freed, returns zero for
> > now, which means the feature is disabled. We will enable it once all
> > the infrastructure is there.
> >
> > Signed-off-by: Muchun Song <songmuchun@...edance.com>
>
> My memory may betray me after vacation, so bear with me.
>
> > +/*
> > + * Any memory allocated via the memblock allocator and not via the
> > + * buddy allocator will be marked reserved already in the memmap.
> > + * For those pages, we can call this function to free them to the
> > + * buddy allocator.
> > + */
> > +static inline void free_bootmem_page(struct page *page)
> > +{
> > +	unsigned long magic = (unsigned long)page->freelist;
> > +
> > +	/*
> > +	 * The reserve_bootmem_region sets the reserved flag on bootmem
> > +	 * pages.
> > +	 */
> > +	VM_WARN_ON_PAGE(page_ref_count(page) != 2, page);
>
> I have been thinking about this some more.
> While I think that this macro might have its room somewhere, I do not
> think this is the case.
>
> Here, if we see that the page's refcount differs from 2, it means that
> we had an earlier corruption.
> Now, as a person who has dealt with debugging memory corruptions, I
> think it is of no use to proceed further if such a corruption happened,
> as it can lead to problems somewhere else that manifest in funny ways,
> and you will find yourself scratching your head trying to work out what
> happened.
>
> I am aware that this is not the root of the problem here, as someone
> might have had to decrease the refcount, but I would definitely change
> this to its VM_BUG_ON_* variant.
>
> > --- /dev/null
> > +++ b/mm/hugetlb_vmemmap.c
> > [...]
>
> > diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
> > new file mode 100644
> > index 000000000000..6923f03534d5
> > --- /dev/null
> > +++ b/mm/hugetlb_vmemmap.h
> > [...]
>
> > +/**
> > + * vmemmap_remap_free - remap the vmemmap virtual address range
> > + *                      [@start, @end) to the page which @reuse is
> > + *                      mapped to, then free the vmemmap pages.
> > + * @start: start address of the vmemmap virtual address range.
> > + * @end: end address of the vmemmap virtual address range.
> > + * @reuse: reuse address.
> > + */
> > +void vmemmap_remap_free(unsigned long start, unsigned long end,
> > +			unsigned long reuse)
> > +{
> > +	LIST_HEAD(vmemmap_pages);
> > +	struct vmemmap_remap_walk walk = {
> > +		.remap_pte	= vmemmap_remap_pte,
> > +		.reuse_addr	= reuse,
> > +		.vmemmap_pages	= &vmemmap_pages,
> > +	};
> > +
> > +	BUG_ON(start != reuse + PAGE_SIZE);
>
> It seems a bit odd to only pass "start" to the BUG_ON.
> Also, I kind of dislike the "addr += PAGE_SIZE" in vmemmap_pte_range.
>
> I wonder if adding a ".remap_start_addr" would make more sense,
> adding it here with the vmemmap_remap_walk init.

How would vmemmap_pte_range() look then? If we introduce a
".remap_start_addr" in vmemmap_remap_walk, can we drop the
"addr += PAGE_SIZE"?

> --
> Oscar Salvador
> SUSE L3