lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Tue, 25 May 2021 13:08:11 +0000
From:   HORIGUCHI NAOYA(堀口 直也) 
        <naoya.horiguchi@....com>
To:     Oscar Salvador <osalvador@...e.de>
CC:     Mike Kravetz <mike.kravetz@...cle.com>,
        Naoya Horiguchi <nao.horiguchi@...il.com>,
        Muchun Song <songmuchun@...edance.com>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...e.com>,
        Tony Luck <tony.luck@...el.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v5 1/2] mm,hwpoison: fix race with hugetlb page allocation

On Tue, May 25, 2021 at 11:09:18AM +0200, Oscar Salvador wrote:
> On Tue, May 25, 2021 at 08:07:07AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote:
> > OK, here's the current draft.
> > 
> > Thanks,
> > Naoya Horiguchi
> > 
> > ---
> > From: Naoya Horiguchi <naoya.horiguchi@....com>
> > Date: Tue, 18 May 2021 23:49:18 +0900
> > Subject: [PATCH] mm,hwpoison: fix race with hugetlb page allocation
> > 
> > When hugetlb page fault (under overcommitting situation) and
> > memory_failure() race, VM_BUG_ON_PAGE() is triggered by the following race:
> > 
> >     CPU0:                           CPU1:
> > 
> >                                     gather_surplus_pages()
> >                                       page = alloc_surplus_huge_page()
> >     memory_failure_hugetlb()
> >       get_hwpoison_page(page)
> >         __get_hwpoison_page(page)
> >           get_page_unless_zero(page)
> >                                       zero = put_page_testzero(page)
> >                                       VM_BUG_ON_PAGE(!zero, page)
> >                                       enqueue_huge_page(h, page)
> >       put_page(page)
> > 
> > __get_hwpoison_page() only checks the page refcount before taking an
> > additional one for memory error handling, which is wrong because there's
> > a time window where compound pages have non-zero refcount during
> > initialization.  So make __get_hwpoison_page() check page status a bit
> > more for hugetlb pages.
> 
> I think that this changelog would benefit from some information about the new
> !PageLRU && !__PageMovable check.

OK, I'll update about this check.

> >  static int __get_hwpoison_page(struct page *page)
> >  {
> >  	struct page *head = compound_head(page);
> > +	int ret = 0;
> > +	bool hugetlb = false;
> > +
> > +	ret = get_hwpoison_huge_page(head, &hugetlb);
> > +	if (hugetlb)
> > +		return ret;
> > +
> > +	if (!PageLRU(head) && !__PageMovable(head))
> > +		return 0;
> 
> This definitely needs a comment hinting the reader why we need to check for this.
> AFAICS, this is to close the race where a page is about to be a hugetlb page soon,
> so we do not go for get_page_unless_zero(), right?

Right, I can't find any other reliable check to show that a page is not
under initialization of hugetlb pages. I'll state why this check is needed.

> 
> From soft_offline_page's POV I __guess__ that's fine because we only deal with
> pages we know about.
> But what about memory_failure()? I think memory_failure() is less picky about that,
> so it is okay to not take a refcount on that case?

Actually the coverage of page types for which memory error can be handled
are the same between memory_failure() and soft_offline_page().  One
memory_failure-specific role is to print out logs about error events even if
they can't be successfully handled, which could be affected by this change,
so we might need to adjust how memory_failure() uses the return value of
get_hwpoison_page().

Thanks,
Naoya Horiguchi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ