[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e7637427-5f17-b4f4-93a2-70cac9b3a264@ghiti.fr>
Date: Wed, 27 Mar 2019 10:48:54 +0100
From: Alexandre Ghiti <alex@...ti.fr>
To: "Aneesh Kumar K.V" <aneesh.kumar@...ux.ibm.com>,
mpe@...erman.id.au, Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Catalin Marinas <catalin.marinas@....com>,
Will Deacon <will.deacon@....com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Martin Schwidefsky <schwidefsky@...ibm.com>,
Heiko Carstens <heiko.carstens@...ibm.com>,
Yoshinori Sato <ysato@...rs.sourceforge.jp>,
Rich Felker <dalias@...c.org>,
"David S . Miller" <davem@...emloft.net>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
"H . Peter Anvin" <hpa@...or.com>, x86@...nel.org,
Dave Hansen <dave.hansen@...ux.intel.com>,
Andy Lutomirski <luto@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Mike Kravetz <mike.kravetz@...cle.com>,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linuxppc-dev@...ts.ozlabs.org, linux-s390@...r.kernel.org,
linux-sh@...r.kernel.org, sparclinux@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH v8 4/4] hugetlb: allow to free gigantic pages regardless
of the configuration
On 03/27/2019 09:55 AM, Aneesh Kumar K.V wrote:
> On 3/27/19 2:14 PM, Alexandre Ghiti wrote:
>>
>>
>> On 03/27/2019 08:01 AM, Aneesh Kumar K.V wrote:
>>> On 3/27/19 12:06 PM, Alexandre Ghiti wrote:
>>>> On systems without CONTIG_ALLOC activated but that support gigantic
>>>> pages,
>>>> boottime reserved gigantic pages can not be freed at all. This patch
>>>> simply enables the possibility to hand back those pages to memory
>>>> allocator.
>>>>
>>>> Signed-off-by: Alexandre Ghiti <alex@...ti.fr>
>>>> Acked-by: David S. Miller <davem@...emloft.net> [sparc]
>>>>
>>>> diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h
>>>> b/arch/powerpc/include/asm/book3s/64/hugetlb.h
>>>> index ec2a55a553c7..7013284f0f1b 100644
>>>> --- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
>>>> +++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
>>>> @@ -36,8 +36,8 @@ static inline int hstate_get_psize(struct hstate
>>>> *hstate)
>>>> }
>>>> }
>>>> -#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
>>>> -static inline bool gigantic_page_supported(void)
>>>> +#define __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED
>>>> +static inline bool gigantic_page_runtime_supported(void)
>>>> {
>>>> /*
>>>> * We used gigantic page reservation with hypervisor assist
>>>> in some case.
>>>> @@ -49,7 +49,6 @@ static inline bool gigantic_page_supported(void)
>>>> return true;
>>>> }
>>>> -#endif
>>>> /* hugepd entry valid bit */
>>>> #define HUGEPD_VAL_BITS (0x8000000000000000UL)
>>>
>>> Is that correct when CONTIG_ALLOC is not enabled? I guess we want
>>>
>>> gigantic_page_runtime_supported to return false when CONTIG_ALLOC is
>>> not enabled on all architectures and on POWER when it is enabled we
>>> want it to be conditional as it is now.
>>>
>>> -aneesh
>>>
>>
>> CONFIG_ARCH_HAS_GIGANTIC_PAGE is set by default when an architecture
>> supports gigantic
>> pages: on its own, it allows to allocate boottime gigantic pages AND
>> to free them at runtime
>> (this is the goal of this series), but not to allocate runtime
>> gigantic pages.
>> If CONTIG_ALLOC is set, it allows in addition to allocate runtime
>> gigantic pages.
>>
>> I re-introduced the runtime checks because we can't know at compile
>> time if powerpc can
>> or not support gigantic pages.
>>
>> So for all architectures, gigantic_page_runtime_supported only
>> depends on
>> CONFIG_ARCH_HAS_GIGANTIC_PAGE enabled or not. The possibility to
>> allocate runtime
>> gigantic pages is dealt with after those runtime checks.
>>
>
> you removed that #ifdef in the patch above. ie we had
> #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
> static inline bool gigantic_page_supported(void)
> {
> /*
> * We used gigantic page reservation with hypervisor assist in
> some case.
> * We cannot use runtime allocation of gigantic pages in those
> platforms
> * This is hash translation mode LPARs.
> */
> if (firmware_has_feature(FW_FEATURE_LPAR) && !radix_enabled())
> return false;
>
> return true;
> }
> #endif
Yes, I removed the #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE because it was
defined only
if CONTIG_ALLOC was set. But now, CONFIG_ARCH_HAS_GIGANTIC_PAGE is
inconditionally
set for powerpc so I think we don't need it anymore.
Actually I have doubts now, is this true for all configurations ? I see
that it is only set for
PPC_RADIX_MMU. I think the problem is here: instead of returning true,
it should do like
the generic version, ie return IS_ENABLED(CONFIG_ARCH_HAS_GIGANTIC_PAGE).
Do you agree ?
>
>
> This is now
> #define __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED
> static inline bool gigantic_page_runtime_supported(void)
> {
> if (firmware_has_feature(FW_FEATURE_LPAR) && !radix_enabled())
> return false;
>
> return true;
> }
>
>
> I am wondering whether it should be
>
> #define __HAVE_ARCH_GIGANTIC_PAGE_RUNTIME_SUPPORTED
> static inline bool gigantic_page_runtime_supported(void)
> {
>
> if (!IS_ENABLED(CONFIG_CONTIG_ALLOC))
> return false;
I don't think this test should happen here, CONFIG_CONTIG_ALLOC only allows
to allocate gigantic pages, doing that check here would prevent powerpc
to free boottime gigantic pages when not a guest. Note that this check
is actually done in set_max_huge_pages.
>
> if (firmware_has_feature(FW_FEATURE_LPAR) && !radix_enabled())
> return false;
Maybe I did not understand this check: I understood that, in the case
the system
is virtualized, we do not want it to hand back gigantic pages. Does this
check
test if the system is currently being virtualized ?
If yes, I think the patch is correct: it prevents freeing gigantic pages
when the system
is virtualized but allows a 'normal' system to free gigantic pages.
>
> return true;
> }
>
> or add that #ifdef back.
>
>> By the way, I forgot to ask you why you think that if an arch cannot
>> allocate runtime gigantic
>> pages, it should not be able to free boottime gigantic pages ?
>>
>
> on virtualized platforms like powervm which use a paravirtualized page
> table update mechanism (we dont' have two level table). The ability to
> map a page huge depends on how hypervisor allocated the guest ram.
> Hypervisor also allocates the guest specific page table of a specific
> size depending on how many pages are going to be mapped by what page
> size.
>
> on POWER we indicate possible guest real address that can be mapped
> via hugepage (in this case 16G) using a device tree node
> (ibm,expected#pages) . It is expected that we will map these pages
> only as 16G pages. Hence we cannot free them back to the buddy where
> it could get mapped via 64K page size.
Thanks for the explanations.
Alex
>
> -aneesh
>
>
Powered by blists - more mailing lists