Message-ID: <20240903164532.3874988-1-scott@os.amperecomputing.com>
Date: Tue, 3 Sep 2024 09:45:32 -0700
From: D Scott Phillips <scott@...amperecomputing.com>
To: Catalin Marinas <catalin.marinas@....com>,
Mark Rutland <mark.rutland@....com>,
Will Deacon <will@...nel.org>,
Ard Biesheuvel <ardb@...nel.org>,
linux-arm-kernel@...ts.infradead.org
Cc: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
AKASHI Takahiro <takahiro.akashi@...aro.org>,
Alison Schofield <alison.schofield@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andrey Konovalov <andreyknvl@...il.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Ankit Agrawal <ankita@...dia.com>,
Baoquan He <bhe@...hat.com>,
Dan Williams <dan.j.williams@...el.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Wang Jinchao <wangjinchao@...sion.com>,
linux-kernel@...r.kernel.org,
patches@...erecomputing.com
Subject: [PATCH v3] arm64: Expose the end of the linear map in PHYSMEM_END
The memory hot-plug and resource management code needs to know the
largest address which can fit in the linear map, so set
PHYSMEM_END for that purpose.
This fixes a crash[1] at boot when amdgpu tries to create
DEVICE_PRIVATE_MEMORY and the resource management code gives it a
physical address outside the range which can have a `struct page`.
The Fixes: commit listed below isn't actually broken, but the
reorganization of vmemmap causes the improper DEVICE_PRIVATE_MEMORY address
to go from a warning to a crash.
[1]: Unable to handle kernel paging request at virtual address
000001ffa6000034
Mem abort info:
ESR = 0x0000000096000044
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
CM = 0, WnR = 1, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=000008000287c000
[000001ffa6000034] pgd=0000000000000000, p4d=0000000000000000
Call trace:
__init_zone_device_page.constprop.0+0x2c/0xa8
memmap_init_zone_device+0xf0/0x210
pagemap_range+0x1e0/0x410
memremap_pages+0x18c/0x2e0
devm_memremap_pages+0x30/0x90
kgd2kfd_init_zone_device+0xf0/0x200 [amdgpu]
amdgpu_device_ip_init+0x674/0x888 [amdgpu]
amdgpu_device_init+0x7a4/0xea0 [amdgpu]
amdgpu_driver_load_kms+0x28/0x1c0 [amdgpu]
amdgpu_pci_probe+0x1a0/0x560 [amdgpu]
local_pci_probe+0x48/0xb8
work_for_cpu_fn+0x24/0x40
process_one_work+0x170/0x3e0
worker_thread+0x2ac/0x3e0
kthread+0xf4/0x108
ret_from_fork+0x10/0x20
Fixes: 32697ff38287 ("arm64: vmemmap: Avoid base2 order of struct page size to dimension region")
Signed-off-by: D Scott Phillips <scott@...amperecomputing.com>
Cc: stable@...r.kernel.org
---
Link to v2: https://lore.kernel.org/all/20240709002757.2431399-1-scott@os.amperecomputing.com/
Changes since v2:
- Change approach again to defining the newly created PHYSMEM_END in
arch/arm64/include/asm/memory.h
Link to v1: https://lore.kernel.org/all/20240703210707.1986816-1-scott@os.amperecomputing.com/
Changes since v1:
- Change from fiddling the architecture's MAX_PHYSMEM_BITS to checking
arch_get_mappable_range().
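
For reviewers who want to sanity-check the arithmetic, here is a minimal
userspace sketch of what `__pa(PAGE_END - 1)` works out to. It is a
simplified model, not kernel code: it assumes 48-bit VAs as in the trace
above, mirrors the `_PAGE_OFFSET()`/`_PAGE_END()` definitions, treats `__pa()`
on a linear map address as a plain offset, and uses a hypothetical
`phys_offset` of 0x80000000 chosen only for illustration. The helper names
are likewise made up for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified userspace model of the arm64 linear map; not kernel code. */
#define MODEL_VA_BITS 48u /* assumption: matches "48-bit VAs" in the trace */

/* Hypothetical start of RAM, chosen only for illustration. */
static const uint64_t phys_offset = UINT64_C(0x80000000);

/* Start of the linear map: -(1 << va_bits), as a 64-bit address. */
static uint64_t model_page_offset(unsigned int va_bits)
{
	assert(va_bits > 1 && va_bits < 64);
	return -(UINT64_C(1) << va_bits);
}

/* First address past the linear map: -(1 << (va_bits - 1)). */
static uint64_t model_page_end(unsigned int va_bits)
{
	assert(va_bits > 1 && va_bits < 64);
	return -(UINT64_C(1) << (va_bits - 1));
}

/* Linear map virtual address to physical address (plain offset). */
static uint64_t model_lm_to_phys(uint64_t va)
{
	return (va - model_page_offset(MODEL_VA_BITS)) + phys_offset;
}

/* Models PHYSMEM_END: the last physical address the linear map covers. */
static uint64_t model_physmem_end(void)
{
	return model_lm_to_phys(model_page_end(MODEL_VA_BITS) - 1);
}
```

Under those assumptions the linear map spans 2^47 bytes, so
model_physmem_end() returns 0x80007fffffff, i.e. phys_offset + 2^47 - 1.
Any physical address above that has no linear map address in this model.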
arch/arm64/include/asm/memory.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 54fb014eba05..0480c61dbb4f 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -110,6 +110,8 @@
 #define PAGE_END (_PAGE_END(VA_BITS_MIN))
 #endif /* CONFIG_KASAN */
 
+#define PHYSMEM_END __pa(PAGE_END - 1)
+
 #define MIN_THREAD_SHIFT (14 + KASAN_THREAD_SHIFT)
 
 /*
--
2.46.0