lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <11b03fcd-c210-032c-16d2-79ada41e0349@arm.com>
Date:   Mon, 20 Jul 2020 11:52:21 +0530
From:   Anshuman Khandual <anshuman.khandual@....com>
To:     Mike Kravetz <mike.kravetz@...cle.com>,
        Will Deacon <will@...nel.org>, Roman Gushchin <guro@...com>
Cc:     Barry Song <song.bao.hua@...ilicon.com>,
        Catalin Marinas <catalin.marinas@....com>, x86@...nel.org,
        linux-kernel@...r.kernel.org, linuxarm@...wei.com,
        linux-mm@...ck.org, Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Jonathan Cameron <jonathan.cameron@...wei.com>,
        "H.Peter Anvin" <hpa@...or.com>, Borislav Petkov <bp@...en8.de>,
        akpm@...ux-foundation.org, Mike Rapoport <rppt@...ux.ibm.com>,
        linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH v3] mm/hugetlb: split hugetlb_cma in nodes with memory



On 07/17/2020 10:32 PM, Mike Kravetz wrote:
> On 7/16/20 10:02 PM, Anshuman Khandual wrote:
>>
>>
>> On 07/16/2020 11:55 PM, Mike Kravetz wrote:
>>> >From 17c8f37afbf42fe7412e6eebb3619c6e0b7e1c3c Mon Sep 17 00:00:00 2001
>>> From: Mike Kravetz <mike.kravetz@...cle.com>
>>> Date: Tue, 14 Jul 2020 15:54:46 -0700
>>> Subject: [PATCH] hugetlb: move cma reservation to code setting up gigantic
>>>  hstate
>>>
>>> Instead of calling hugetlb_cma_reserve() directly from arch specific
>>> code, call from hugetlb_add_hstate when adding a gigantic hstate.
>>> hugetlb_add_hstate is either called from arch specific huge page setup,
>>> or as the result of hugetlb command line processing.  In either case,
>>> this is late enough in the init process that all numa memory information
>>> should be initialized.  And, it is early enough to still use early
>>> memory allocator.
>>
>> This assumes that hugetlb_add_hstate() is called from the arch code at
>> the right point in time for the generic HugeTLB to do the required CMA
>> reservation which is not ideal. I guess it must have been a reason why
>> CMA reservation should always called by the platform code which knows
>> the boot sequence timing better.
> 
> Actually, the code does not make the assumption that hugetlb_add_hstate
> is called from arch specific huge page setup.  It can even be called later
> at the time of hugetlb command line processing.

Yes, now that hugetlb_cma_reserve() has been moved into hugetlb_add_hstate().
But then there is an explicit warning while trying to mix both the command
line options i.e hugepagesz= and hugetlb_cma=. The proposed code here have
not changed that behavior and hence the following warning should have been
triggered here as well.

1) hugepagesz_setup()
	hugetlb_add_hstate()
		 hugetlb_cma_reserve()

2) hugepages_setup()
	hugetlb_hstate_alloc_pages()	when order >= MAX_ORDER

	if (hstate_is_gigantic(h)) {
        	if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
                	pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
			break;
                }
		if (!alloc_bootmem_huge_page(h))
                break;
	}

Nonetheless, it does not make sense to mix both memblock and CMA based huge
page pre-allocations. But looking at this again, could this warning be ever
triggered till now ? Unless, a given platform calls hugetlb_cma_reserve()
before _setup("hugepages=", hugepages_setup). Anyways, there seems to be
good reasons to keep both memblock and CMA based pre-allocations in place.
But mixing them together (as done in the proposed code here) does not seem
to be right.

> 
> My 'reasoning' is that gigantic pages can currently be preallocated from
> bootmem/memblock_alloc at the time of command line processing.  Therefore,
> we should be able to reserve bootmem for CMA at the same time.  Is there
> something wrong with this reasoning?  I tested this on x86 by removing the
> call to hugetlb_add_hstate from arch specific code and instead forced the
> call at command line processing time.  The ability to reserve CMA was the
> same.

There is no problem with that reasoning. __setup() triggered function should
be able perform CMA reservation. But as pointed out before, it does not make
sense to mix both CMA reservation and memblock based pre-allocation.

> 
> Yes, the CMA reservation interface says it should be called from arch
> specific code.  However, if we currently depend on the ability to do
> memblock_alloc at hugetlb command line processing time for gigantic page
> preallocation, then I think we can do the CMA reservation here as well.

IIUC, CMA reservation and memblock alloc have some differences in terms of
how the memory can be used later on, will have to dig deeper on this. But
the comment section near cma_declare_contiguous_nid() is a concern.

 * This function reserves memory from early allocator. It should be
 * called by arch specific code once the early allocator (memblock or bootmem)
 * has been activated and all other subsystems have already allocated/reserved
 * memory. This function allows to create custom reserved areas.

> 
> Thinking about it some more, I suppose there could be some arch code that
> could call hugetlb_add_hstate too early in the boot process.  But, I do
> not think we have an issue with calling it too late.
> 

Calling it too late might have got the page allocator initialized completely
and then CMA reservation would not be possible afterwards. Also calling it
too early would prevent other subsystems which might need memory reservation
in specific physical ranges.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ