linux-kernel - Re: [PATCH v6 1/2] mm,hwpoison: fix race with hugetlb page allocation

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20210812042813.GA1576603@agluck-desk2.amr.corp.intel.com>
Date:   Wed, 11 Aug 2021 21:28:13 -0700
From:   "Luck, Tony" <tony.luck@...el.com>
To:     Naoya Horiguchi <nao.horiguchi@...il.com>
Cc:     Oscar Salvador <osalvador@...e.de>,
        Muchun Song <songmuchun@...edance.com>,
        Mike Kravetz <mike.kravetz@...cle.com>, linux-mm@...ck.org,
        Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...e.com>,
        Naoya Horiguchi <naoya.horiguchi@....com>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 1/2] mm,hwpoison: fix race with hugetlb page allocation

On Fri, Jun 04, 2021 at 08:36:31AM +0900, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@....com>
> 
> When hugetlb page fault (under overcommitting situation) and
> memory_failure() race, VM_BUG_ON_PAGE() is triggered by the following race:
> 
>     CPU0:                           CPU1:
> 
>                                     gather_surplus_pages()
>                                       page = alloc_surplus_huge_page()
>     memory_failure_hugetlb()
>       get_hwpoison_page(page)
>         __get_hwpoison_page(page)
>           get_page_unless_zero(page)
>                                       zero = put_page_testzero(page)
>                                       VM_BUG_ON_PAGE(!zero, page)
>                                       enqueue_huge_page(h, page)
>       put_page(page)
> 
> __get_hwpoison_page() only checks the page refcount before taking an
> additional one for memory error handling, which is not enough because
> there's a time window where compound pages have non-zero refcount during
> hugetlb page initialization.
> 
> So make __get_hwpoison_page() check page status a bit more for hugetlb
> pages with get_hwpoison_huge_page(). Checking hugetlb-specific flags
> under hugetlb_lock makes sure that the hugetlb page is not transitive.
> It's notable that another new function, HWPoisonHandlable(), is helpful
> to prevent a race against other transitive page states (like a generic
> compound page just before PageHuge becomes true).

I'm seeing some strange results when doing a simple injection/recovery.

Current upstream often fails to offline the page with messages like:

	"high-order kernel page"
or
	"unknown page"

Things were working in v5.12. Broken in v5.13.

Bisect says that:

25182f05ffed ("mm,hwpoison: fix race with hugetlb page allocation")

is the culprit (though it is possible that there is more than one
issue ... failure symptoms changed a bit during the bisection).

This commit doesn't revert automatically from upstream. But it
does revert from v5.13. Running with this reverted from v5.13
gives kernel that recovers normally[1] from hundreds of consecutive
error injections.

-Tony

[1] Almost normally. My test catches SIGBUS and prints the virtual
address from the siginfo_t structure. Sometimes the address is correct
other times it is NULL.