[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <383a72f7-b9d3-963f-e5fb-b0036376399e@bytedance.com>
Date: Thu, 27 Jul 2023 21:56:28 +0100
From: Usama Arif <usama.arif@...edance.com>
To: Mike Rapoport <rppt@...nel.org>
Cc: linux-mm@...ck.org, muchun.song@...ux.dev, mike.kravetz@...cle.com,
linux-kernel@...r.kernel.org, fam.zheng@...edance.com,
liangma@...ngbit.com, simon.evans@...edance.com,
punit.agrawal@...edance.com
Subject: Re: [External] Re: [RFC 2/4] mm/memblock: Add hugepage_size member to
struct memblock_region
On 27/07/2023 05:30, Mike Rapoport wrote:
> On Wed, Jul 26, 2023 at 04:02:21PM +0100, Usama Arif wrote:
>>
>> On 26/07/2023 12:01, Mike Rapoport wrote:
>>> On Mon, Jul 24, 2023 at 02:46:42PM +0100, Usama Arif wrote:
>>>> This propagates the hugepage size from the memblock APIs
>>>> (memblock_alloc_try_nid_raw and memblock_alloc_range_nid)
>>>> so that it can be stored in struct memblock region. This does not
>>>> introduce any functional change and hugepage_size is not used in
>>>> this commit. It is just a setup for the next commit where huge_pagesize
>>>> is used to skip initialization of struct pages that will be freed later
>>>> when HVO is enabled.
>>>>
>>>> Signed-off-by: Usama Arif <usama.arif@...edance.com>
>>>> ---
>>>> arch/arm64/mm/kasan_init.c | 2 +-
>>>> arch/powerpc/platforms/pasemi/iommu.c | 2 +-
>>>> arch/powerpc/platforms/pseries/setup.c | 4 +-
>>>> arch/powerpc/sysdev/dart_iommu.c | 2 +-
>>>> include/linux/memblock.h | 8 ++-
>>>> mm/cma.c | 4 +-
>>>> mm/hugetlb.c | 6 +-
>>>> mm/memblock.c | 60 ++++++++++++--------
>>>> mm/mm_init.c | 2 +-
>>>> mm/sparse-vmemmap.c | 2 +-
>>>> tools/testing/memblock/tests/alloc_nid_api.c | 2 +-
>>>> 11 files changed, 56 insertions(+), 38 deletions(-)
>>>>
>>>
>>> [ snip ]
>>>
>>>> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
>>>> index f71ff9f0ec81..bb8019540d73 100644
>>>> --- a/include/linux/memblock.h
>>>> +++ b/include/linux/memblock.h
>>>> @@ -63,6 +63,7 @@ struct memblock_region {
>>>> #ifdef CONFIG_NUMA
>>>> int nid;
>>>> #endif
>>>> + phys_addr_t hugepage_size;
>>>> };
>>>> /**
>>>> @@ -400,7 +401,8 @@ phys_addr_t memblock_phys_alloc_range(phys_addr_t size, phys_addr_t align,
>>>> phys_addr_t start, phys_addr_t end);
>>>> phys_addr_t memblock_alloc_range_nid(phys_addr_t size,
>>>> phys_addr_t align, phys_addr_t start,
>>>> - phys_addr_t end, int nid, bool exact_nid);
>>>> + phys_addr_t end, int nid, bool exact_nid,
>>>> + phys_addr_t hugepage_size);
>>>
>>> Rather than adding yet another parameter to memblock_phys_alloc_range() we
>>> can have an API that sets a flag on the reserved regions.
>>> With this the hugetlb reservation code can set a flag when HVO is
>>> enabled and memmap_init_reserved_pages() will skip regions with this flag
>>> set.
>>>
>>
>> Hi,
>>
>> Thanks for the review.
>>
>> I think you meant memblock_alloc_range_nid/memblock_alloc_try_nid_raw and
>> not memblock_phys_alloc_range?
>
> Yes.
>
>> My initial approach was to use flags, but I think it looks worse than what I
>> have done in this RFC (I have pushed the flags prototype at
>> https://github.com/uarif1/linux/commits/flags_skip_prep_init_gigantic_HVO,
>> top 4 commits for reference (the main difference is patch 2 and 4 from
>> RFC)). The major points are (the bigger issue is in patch 4):
>>
>> - (RFC vs flags patch 2 comparison) In the RFC, hugepage_size is propagated
>> from memblock_alloc_try_nid_raw through function calls. When using flags,
>> the "no_init" boolean is propogated from memblock_alloc_try_nid_raw through
>> function calls until the region flags are available in memblock_add_range
>> and the new MEMBLOCK_NOINIT flag is set. I think its a bit more tricky to
>> introduce a new function to set the flag in the region AFTER the call to
>> memblock_alloc_try_nid_raw has finished as the memblock_region can not be
>> found.
>> So something (hugepage_size/flag information) still has to be propagated
>> through function calls and a new argument needs to be added.
>
> Sorry if I wasn't clear. I didn't mean to add flags parameter, I meant to
> add a flag and a function that sets this flag for a range. So for
> MEMBLOCK_NOINIT there would be
>
> int memblock_mark_noinit(phys_addr_t base, phys_addr_t size);
>
> I'd just name this flag MEMBLOCK_RSRV_NOINIT to make it clear it controls
> the reserved regions.
>
> This won't require updating all call sites of memblock_alloc_range_nid()
> and memblock_alloc_try_nid_raw() but only a small refactoring of
> memblock_setclr_flag() and its callers.
>
Thanks for this, its much cleaner doing the way you described. I have
sent v1 implementing this
https://lore.kernel.org/all/20230727204624.1942372-1-usama.arif@bytedance.com/.
Regards,
Usama
Powered by blists - more mailing lists