Message-ID: <CAMZfGtVjLOF27VMVJ5fF8CDJRpZ0t7fWpmMHB9D-ipMK6b=POg@mail.gmail.com>
Date: Thu, 11 Mar 2021 12:26:32 +0800
From: Muchun Song <songmuchun@...edance.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Jonathan Corbet <corbet@....net>,
Mike Kravetz <mike.kravetz@...cle.com>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, bp@...en8.de, x86@...nel.org,
hpa@...or.com, dave.hansen@...ux.intel.com, luto@...nel.org,
Peter Zijlstra <peterz@...radead.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
Andrew Morton <akpm@...ux-foundation.org>, paulmck@...nel.org,
mchehab+huawei@...nel.org, pawan.kumar.gupta@...ux.intel.com,
Randy Dunlap <rdunlap@...radead.org>, oneukum@...e.com,
anshuman.khandual@....com, jroedel@...e.de,
Mina Almasry <almasrymina@...gle.com>,
David Rientjes <rientjes@...gle.com>,
Matthew Wilcox <willy@...radead.org>,
Oscar Salvador <osalvador@...e.de>,
"Song Bao Hua (Barry Song)" <song.bao.hua@...ilicon.com>,
David Hildenbrand <david@...hat.com>,
HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@....com>,
Joao Martins <joao.m.martins@...cle.com>,
Xiongchun duan <duanxiongchun@...edance.com>,
linux-doc@...r.kernel.org, LKML <linux-kernel@...r.kernel.org>,
Linux Memory Management List <linux-mm@...ck.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
Chen Huang <chenhuang5@...wei.com>,
Bodeddula Balasubramaniam <bodeddub@...zon.com>
Subject: Re: [External] Re: [PATCH v18 4/9] mm: hugetlb: alloc the vmemmap
pages associated with each HugeTLB page
On Wed, Mar 10, 2021 at 11:19 PM Michal Hocko <mhocko@...e.com> wrote:
>
> On Mon 08-03-21 18:28:02, Muchun Song wrote:
> [...]
> > -static void update_and_free_page(struct hstate *h, struct page *page)
> > +static int update_and_free_page(struct hstate *h, struct page *page)
> > + __releases(&hugetlb_lock) __acquires(&hugetlb_lock)
> > {
> > int i;
> > struct page *subpage = page;
> > + int nid = page_to_nid(page);
> >
> > if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
> > - return;
> > + return 0;
> >
> > h->nr_huge_pages--;
> > - h->nr_huge_pages_node[page_to_nid(page)]--;
> > + h->nr_huge_pages_node[nid]--;
> > + VM_BUG_ON_PAGE(hugetlb_cgroup_from_page(page), page);
> > + VM_BUG_ON_PAGE(hugetlb_cgroup_from_page_rsvd(page), page);
>
> > + set_page_refcounted(page);
> > + set_compound_page_dtor(page, NULL_COMPOUND_DTOR);
> > +
> > + /*
> > + * If the vmemmap pages associated with the HugeTLB page can be
> > + * optimized or the page is gigantic, we might block in
> > + * alloc_huge_page_vmemmap() or free_gigantic_page(). In both
> > + * cases, drop the hugetlb_lock.
> > + */
> > + if (free_vmemmap_pages_per_hpage(h) || hstate_is_gigantic(h))
> > + spin_unlock(&hugetlb_lock);
> > +
> > + if (alloc_huge_page_vmemmap(h, page)) {
> > + spin_lock(&hugetlb_lock);
> > + INIT_LIST_HEAD(&page->lru);
> > + set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
> > + h->nr_huge_pages++;
> > + h->nr_huge_pages_node[nid]++;
> > +
> > + /*
> > + * If we cannot allocate vmemmap pages, just refuse to free the
> > + * page and put the page back on the hugetlb free list and treat
> > + * as a surplus page.
> > + */
> > + h->surplus_huge_pages++;
> > + h->surplus_huge_pages_node[nid]++;
> > +
> > + /*
> > + * The refcount can possibly be increased by memory-failure or
> > + * soft_offline handlers.
>
> This comment could be more helpful. I believe you want to say this:
> /*
>  * HWpoisoning code can increment the reference
>  * count here. If there is a race then bail out;
>  * the holder of the additional reference count will
>  * free up the page with put_page.
>  */
Right. I will reuse this. Thanks.
> > + */
> > + if (likely(put_page_testzero(page))) {
> > + arch_clear_hugepage_flags(page);
> > + enqueue_huge_page(h, page);
> > + }
> > +
> > + return -ENOMEM;
> > + }
> > +
> > for (i = 0; i < pages_per_huge_page(h);
> > i++, subpage = mem_map_next(subpage, page, i)) {
> > subpage->flags &= ~(1 << PG_locked | 1 << PG_error |
> [...]
> > @@ -1447,7 +1486,7 @@ void free_huge_page(struct page *page)
> > /*
> > * Defer freeing if in non-task context to avoid hugetlb_lock deadlock.
> > */
> > - if (!in_task()) {
> > + if (in_atomic()) {
>
> As I've said elsewhere, in_atomic() doesn't work for CONFIG_PREEMPT_COUNT=n.
> We need this change for other reasons and so it would be better to pull
> it out into a separate patch which also makes HUGETLB depend on
> PREEMPT_COUNT.
>
> [...]
> > @@ -1771,8 +1813,12 @@ int dissolve_free_huge_page(struct page *page)
> > h->free_huge_pages--;
> > h->free_huge_pages_node[nid]--;
> > h->max_huge_pages--;
> > - update_and_free_page(h, head);
> > - rc = 0;
> > + rc = update_and_free_page(h, head);
> > + if (rc) {
> > + h->surplus_huge_pages--;
> > + h->surplus_huge_pages_node[nid]--;
> > + h->max_huge_pages++;
>
> This is quite ugly and confusing. update_and_free_page is careful to do
> the proper counters accounting and now you just override it partially.
> Why can't we rely on update_and_free_page to do the right thing?
The dissolving path is special here. When update_and_free_page() fails,
it increments the surplus counters. Surplus pages are the pages in
excess of max_huge_pages. Since we increment max_huge_pages again on
failure (undoing the earlier decrement), we should also undo the
addition to surplus_huge_pages and surplus_huge_pages_node[nid].
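
To make the accounting concrete, here is a minimal user-space sketch of
the counters only. It is not kernel code: struct toy_hstate, the toy_*
functions and the vmemmap_alloc_fails flag are hypothetical names made up
for illustration, per-node counters and locking are left out, and it
assumes the common case where put_page_testzero() succeeds so the page
goes back on the free list.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy stand-in for the global counters in struct hstate (one node only). */
struct toy_hstate {
	long nr_huge_pages;
	long free_huge_pages;
	long surplus_huge_pages;
	long max_huge_pages;
};

/*
 * Models only the accounting of the patched update_and_free_page().
 * vmemmap_alloc_fails is a hypothetical knob standing in for a failing
 * alloc_huge_page_vmemmap().
 */
static int toy_update_and_free_page(struct toy_hstate *h,
				    bool vmemmap_alloc_fails)
{
	h->nr_huge_pages--;

	if (vmemmap_alloc_fails) {
		/* Refuse to free: keep the page and count it as surplus. */
		h->nr_huge_pages++;
		h->surplus_huge_pages++;
		/* Assumed: enqueue_huge_page() puts it back on the free list. */
		h->free_huge_pages++;
		return -ENOMEM;
	}

	return 0;
}

/* Models only the accounting of the patched dissolve_free_huge_page(). */
static int toy_dissolve_free_huge_page(struct toy_hstate *h,
				       bool vmemmap_alloc_fails)
{
	int rc;

	h->free_huge_pages--;
	h->max_huge_pages--;

	rc = toy_update_and_free_page(h, vmemmap_alloc_fails);
	if (rc) {
		/*
		 * The pool size is restored, so the page is not in excess
		 * of max_huge_pages: undo the surplus increment as well.
		 */
		h->surplus_huge_pages--;
		h->max_huge_pages++;
	}

	return rc;
}
```

In this model a failed dissolve leaves all four counters exactly as they
were before the call, which is the intent of the extra adjustment in the
dissolving path.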
>
> --
> Michal Hocko
> SUSE Labs