Date:   Wed, 27 Jun 2018 16:27:24 -0700
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Cannon Matthews <cannonmatthews@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Nadia Yvette Chambers <nyc@...omorphy.com>
Cc:     linux-mm@...ck.org, linux-kernel@...r.kernel.org,
        andreslc@...gle.com, pfeiner@...gle.com, gthelen@...gle.com
Subject: Re: [PATCH] mm: hugetlb: yield when prepping struct pages

On 06/27/2018 02:44 PM, Cannon Matthews wrote:
> When booting with very large numbers of gigantic (i.e. 1G) pages, the
> operations in the loop of gather_bootmem_prealloc, and specifically
> prep_compound_gigantic_page, takes a very long time, and can cause a
> softlockup if enough pages are requested at boot.
> 
> For example booting with 3844 1G pages requires prepping

Wow!  I wish I had a system with that much memory to test. :)

> (set_compound_head, init the count) over 1 billion 4K tail pages, which
> takes considerable time. This should also apply to reserving the same
> amount of memory as 2M pages, as the same number of struct pages
> are affected in either case.

Actually, this change would not apply to 2M pages (on x86).  The hugetlbfs
initialization code is a bit confusing, but alloc_bootmem_huge_page and
gather_bootmem_prealloc are only exercised when the huge page
order is >= MAX_ORDER.
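
To make the order comparison concrete, here is a small user-space sketch.
It assumes x86-64 with 4K base pages and a MAX_ORDER of 11 (the usual
default for kernels of this era; other configurations differ).  The helper
names are made up for illustration and are not the kernel's own functions.

```c
#include <assert.h>

/* Assumptions for illustration: 4K base pages, MAX_ORDER of 11. */
#define PAGE_SHIFT 12
#define MAX_ORDER  11

/* Order of a huge page: log2(huge page size / base page size). */
static unsigned int huge_page_order_for_shift(unsigned int huge_shift)
{
	assert(huge_shift >= PAGE_SHIFT);
	return huge_shift - PAGE_SHIFT;
}

/* Same idea as the "order >= MAX_ORDER" test: too big for the buddy allocator. */
static int order_is_gigantic(unsigned int order)
{
	return order >= MAX_ORDER;
}

/* Tail struct pages to prep for nr huge pages (one head page each is excluded). */
static unsigned long long tail_pages(unsigned long long nr, unsigned int order)
{
	return nr * ((1ULL << order) - 1);
}
```

Under these assumptions a 2M page (shift 21) has order 9 and is not
gigantic, a 1G page (shift 30) has order 18 and is, and tail_pages(3844, 18)
comes to 1007677692, which matches the "over 1 billion" figure quoted above.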

Allocation and initialization of 2M pages happens after the normal memory
allocators are set up, via the routine hugetlb_hstate_alloc_pages.  And
there is already a cond_resched in that loop today.

Note the 'else if' in the for loop of hugetlb_hstate_alloc_pages.  This
allows the same routine to be called for early gigantic page allocations
using the bootmem allocator, and later normal (2M) allocations using the
normal memory allocators.  To me, this is a source of confusion and is
something I plan to clean up in the future.
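
A rough user-space model of that dual-purpose loop is sketched below.  It
is not the kernel source: struct hstate_model, model_alloc and the counters
are hypothetical stand-ins, and MAX_ORDER is assumed to be 11.  It only
shows how one loop can serve both allocation paths and where the
reschedule point sits.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ORDER 11	/* assumed value, see above */

/* Hypothetical stand-in for struct hstate plus some bookkeeping. */
struct hstate_model {
	unsigned int order;
	unsigned long max_huge_pages;	/* pages requested at boot */
	unsigned long avail;		/* pages the model can still hand out */
	unsigned long bootmem_allocs;	/* early, gigantic path */
	unsigned long buddy_allocs;	/* normal allocator path */
	unsigned long resched_points;	/* times the loop could yield */
};

/* Hand out one page if any remain and count it against the given path. */
static bool model_alloc(struct hstate_model *h, unsigned long *counter)
{
	if (h->avail == 0)
		return false;
	h->avail--;
	(*counter)++;
	return true;
}

static void model_hstate_alloc_pages(struct hstate_model *h)
{
	unsigned long i;

	assert(h);
	for (i = 0; i < h->max_huge_pages; ++i) {
		if (h->order >= MAX_ORDER) {
			/* gigantic: bootmem path, struct pages prepped later */
			if (!model_alloc(h, &h->bootmem_allocs))
				break;
		} else if (!model_alloc(h, &h->buddy_allocs))
			break;
		h->resched_points++;	/* where cond_resched() sits */
	}
}
```

In this model an order-18 hstate only ever touches bootmem_allocs and an
order-9 hstate only buddy_allocs, and either path stops early once the
allocator runs dry.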

> Add a cond_resched() to the outer loop in gather_bootmem_prealloc() to
> prevent this lockup.
> 
> Tested: Booted with softlockup_panic=1 hugepagesz=1G hugepages=3844 and
> no softlockup is reported, and the hugepages are reported as
> successfully setup.
> 
> Signed-off-by: Cannon Matthews <cannonmatthews@...gle.com>

My only suggestion would be to remove the mention of 2M pages in the
commit message.  Thanks for adding this.

Reviewed-by: Mike Kravetz <mike.kravetz@...cle.com>
-- 
Mike Kravetz

> ---
>  mm/hugetlb.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index a963f2034dfc..d38273c32d3b 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -2169,6 +2169,7 @@ static void __init gather_bootmem_prealloc(void)
>  		 */
>  		if (hstate_is_gigantic(h))
>  			adjust_managed_page_count(page, 1 << h->order);
> +		cond_resched();
>  	}
>  }
>  
> 
