Message-ID: <3c536d3c-a180-301b-5cb7-c737a178a9d7@oracle.com>
Date:   Tue, 23 Feb 2021 10:06:12 -0800
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Gerald Schaefer <gerald.schaefer@...ux.ibm.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org, Michal Hocko <mhocko@...e.com>,
        Heiko Carstens <hca@...ux.ibm.com>,
        Sven Schnelle <svens@...ux.ibm.com>
Subject: Re: [RFC] linux-next panic in hugepage_subpool_put_pages()

On 2/23/21 6:57 AM, Gerald Schaefer wrote:
> Hi,
> 
> LTP triggered a panic on s390 in hugepage_subpool_put_pages() with
> linux-next 5.12.0-20210222, see below.
> 
> It crashes on the spin_lock(&spool->lock) at the beginning, because the
> passed-in *spool points to 0000004e00000000, which is not addressable
> memory. It rather looks like some flags and not a proper address. I suspect
> some relation to the recent rework in that area, e.g. commit f1280272ae4d
> ("hugetlb: use page.private for hugetlb specific page flags").
> 
> __free_huge_page() calls hugepage_subpool_put_pages() and takes *spool from
> hugetlb_page_subpool(page), which was changed by that commit to use
> page[1]->private now.
> 

Thanks Gerald,

Yes, I believe f1280272ae4d is the root cause of this issue.  In that
commit, the subpool pointer was moved from page->private of the head
page to page->private of the first subpage.  The page allocator will
initialize (zero) the private field of the head page, but not that of
subpages.  So, that bad subpool pointer is likely an old page->private
value for the page.
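
To illustrate the failure mode described above, here is a minimal userspace
sketch, not kernel code. The struct layout and the sim_* helper names are
hypothetical stand-ins chosen for this example; only the idea (the head
page's private field is zeroed, the first subpage's is not) comes from the
explanation above. The stale value 0x4e00 is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's struct page (assumption:
 * only the private field matters for this illustration). */
struct page {
	unsigned long private;
};

struct hugepage_subpool;	/* opaque; never dereferenced here */

#define NR_SUBPAGES 4		/* arbitrary size for the sketch */

/* Models the allocator behavior described above: only the head
 * page's private field is zeroed, subpages keep stale contents. */
static void sim_alloc_compound(struct page *pages, size_t nr)
{
	assert(nr >= 2);
	pages[0].private = 0;
	/* pages[1..nr-1].private are deliberately left untouched */
}

/* After the rework, the subpool pointer is read from the first
 * subpage (page[1]) instead of the head page. */
static struct hugepage_subpool *sim_page_subpool(struct page *head)
{
	return (struct hugepage_subpool *)(uintptr_t)head[1].private;
}

/* One possible remedy (an assumption, not the actual patch):
 * explicitly set the pointer when preparing a new huge page. */
static void sim_set_page_subpool(struct page *head,
				 struct hugepage_subpool *spool)
{
	head[1].private = (unsigned long)(uintptr_t)spool;
}
```

If every private field holds a stale nonzero value before
sim_alloc_compound() runs, sim_page_subpool() returns a non-NULL garbage
pointer until sim_set_page_subpool(head, NULL) is called, which is the
situation that would make a later spin_lock(&spool->lock) fault.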

That strange call path from set_max_huge_pages to __free_huge_page is
actually how the code puts newly allocated pages on its internal free
list.

I will do a bit more verification and put together a patch (it should
be simple).
-- 
Mike Kravetz
