Message-ID: <f5efa19f-7eb9-4ffd-aa12-6aae19379cf8@arm.com>
Date: Tue, 29 Jul 2025 21:56:16 +0530
From: Dev Jain <dev.jain@....com>
To: Justin He <Justin.He@....com>, Catalin Marinas <Catalin.Marinas@....com>,
Will Deacon <will@...nel.org>, Andrew Morton <akpm@...ux-foundation.org>,
Uladzislau Rezki <urezki@...il.com>
Cc: Anshuman Khandual <Anshuman.Khandual@....com>,
Ryan Roberts <Ryan.Roberts@....com>, Peter Xu <peterx@...hat.com>,
Joey Gouly <Joey.Gouly@....com>, Yicong Yang <yangyicong@...ilicon.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>
Subject: Re: [PATCH] mm: vmalloc: use VMALLOC_EARLY_START boundary for early
vmap area
On 28/07/25 11:49 am, Justin He wrote:
> Hi Dev,
>
>> -----Original Message-----
>> From: Dev Jain <Dev.Jain@....com>
>> Sent: Tuesday, July 22, 2025 2:48 PM
>> To: Justin He <Justin.He@....com>; Catalin Marinas
>> <Catalin.Marinas@....com>; Will Deacon <will@...nel.org>; Andrew
>> Morton <akpm@...ux-foundation.org>; Uladzislau Rezki <urezki@...il.com>
>> Cc: Anshuman Khandual <Anshuman.Khandual@....com>; Ryan Roberts
>> <Ryan.Roberts@....com>; Peter Xu <peterx@...hat.com>; Joey Gouly
>> <Joey.Gouly@....com>; Yicong Yang <yangyicong@...ilicon.com>; Matthew
>> Wilcox (Oracle) <willy@...radead.org>; linux-arm-kernel@...ts.infradead.org;
>> linux-kernel@...r.kernel.org; linux-mm@...ck.org
>> Subject: Re: [PATCH] mm: vmalloc: use VMALLOC_EARLY_START boundary for
>> early vmap area
>>
>>
>> On 22/07/25 9:38 am, Jia He wrote:
>>> When VMALLOC_START is redefined to a new boundary, most subsystems
>>> continue to function correctly. However, vm_area_register_early()
>>> assumes the use of the global _vmlist_ structure before vmalloc_init()
>>> is invoked. This assumption can lead to issues during early boot.
>>>
>>> See the calltrace as follows:
>>> start_kernel()
>>> setup_per_cpu_areas()
>>> pcpu_page_first_chunk()
>>> vm_area_register_early()
>>> mm_core_init()
>>> vmalloc_init()
>>>
>>> The early vm areas will be added to vmlist at declare_kernel_vmas()
>>> ->declare_vma():
>>> ffff800080010000 T _stext
>>> ffff800080da0000 D __start_rodata
>>> ffff800081890000 T __inittext_begin
>>> ffff800081980000 D __initdata_begin
>>> ffff800081ee0000 D _data
>>> The starting address of the early areas is tied to the *old*
>>> VMALLOC_START (i.e. 0xffff800080000000 on an arm64 N2 server).
>>>
>>> If VMALLOC_START is redefined, it can disrupt early VM area
>>> allocation, particularly in paths like pcpu_page_first_chunk() ->
>>> vm_area_register_early().
>>>
>>> To address this potential risk on arm64, introduce a new boundary,
>>> VMALLOC_EARLY_START, to avoid boot issues when VMALLOC_START is
>>> occasionally redefined.
>> Sorry but I am unable to understand the point of the patch. If a particular
>> value of VMALLOC_START causes a problem because the vma declarations of
>> the kernel are tied to that value, surely we should be reasoning about what
>> was wrong about the new value, and not circumventing the actual problem by
>> introducing VMALLOC_EARLY_START?
>>
>> Also by your patch description I don't think you have run into a reproducible
>> boot issue, so this patch is basically adding dead code because both macros
>> are defined to MODULES_END?
>>
> Please try this *debugging*-purpose patch, which triggers the boot panic
> more easily (I can always reproduce the boot panic on an ARM64 server):
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 192d86e1cc76..2be8db8d0205 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -20,7 +20,8 @@
> * VMALLOC_START: beginning of the kernel vmalloc space
> * VMALLOC_END: extends to the available space below vmemmap
> */
> -#define VMALLOC_START (MODULES_END)
> +//#define VMALLOC_START (MODULES_END)
> +#define VMALLOC_START ((MODULES_END & PGDIR_MASK) + PGDIR_SIZE)
> #if VA_BITS == VA_BITS_MIN
> #define VMALLOC_END (VMEMMAP_START - SZ_8M)
> #else
> diff --git a/mm/percpu.c b/mm/percpu.c
> index b35494c8ede2..53d187172b5e 100644
> --- a/mm/percpu.c
> +++ b/mm/percpu.c
> @@ -3051,7 +3051,7 @@ int __init pcpu_embed_first_chunk(size_t reserved_size, size_t dyn_size,
> max_distance += ai->unit_size * ai->groups[highest_group].nr_units;
>
> /* warn if maximum distance is further than 75% of vmalloc space */
> - if (max_distance > VMALLOC_TOTAL * 3 / 4) {
> + if (1 || max_distance > VMALLOC_TOTAL * 3 / 4) {
This condition will now always be true, which leads to returning -EINVAL and
then panicking in setup_per_cpu_areas(). Did you perhaps make this change by
mistake, and mean to say that the VMALLOC_START redefinition above panics?
> pr_warn("max_distance=0x%lx too large for vmalloc space 0x%lx\n",
> max_distance, VMALLOC_TOTAL);
> #ifdef CONFIG_NEED_PER_CPU_PAGE_FIRST_CHUNK
>
>
> ---
> Cheers,
> Justin He(Jia He)
>
>