Message-ID: <ZtdBTHzaeK4JNxvz@smile.fi.intel.com>
Date: Tue, 3 Sep 2024 20:03:08 +0300
From: Andy Shevchenko <andriy.shevchenko@...ux.intel.com>
To: D Scott Phillips <scott@...amperecomputing.com>
Cc: Catalin Marinas <catalin.marinas@....com>,
Mark Rutland <mark.rutland@....com>, Will Deacon <will@...nel.org>,
Ard Biesheuvel <ardb@...nel.org>,
linux-arm-kernel@...ts.infradead.org,
"Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>,
AKASHI Takahiro <takahiro.akashi@...aro.org>,
Alison Schofield <alison.schofield@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Andrey Konovalov <andreyknvl@...il.com>,
Ankit Agrawal <ankita@...dia.com>, Baoquan He <bhe@...hat.com>,
Dan Williams <dan.j.williams@...el.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Wang Jinchao <wangjinchao@...sion.com>,
linux-kernel@...r.kernel.org, patches@...erecomputing.com
Subject: Re: [PATCH v3] arm64: Expose the end of the linear map in PHYSMEM_END
On Tue, Sep 03, 2024 at 09:45:32AM -0700, D Scott Phillips wrote:
> The memory hot-plug and resource management code needs to know the
> largest address which can fit in the linear map, so set
> PHYSMEM_END for that purpose.
>
> This fixes a crash[1] at boot when amdgpu tries to create
> DEVICE_PRIVATE_MEMORY and is given a physical address by the
> resource management code which is outside the range which can have
> a `struct page`.
>
> The Fixes: commit listed below isn't actually broken, but the
> reorganization of vmemmap causes the improper DEVICE_PRIVATE_MEMORY address
> to go from a warning to a crash.
>
> [1]: Unable to handle kernel paging request at virtual address
No need for the "[1]:" prefix here. Please also read
https://www.kernel.org/doc/html/latest/process/submitting-patches.html#backtraces-in-commit-messages
and amend the commit message accordingly.
> 000001ffa6000034
> Mem abort info:
> ESR = 0x0000000096000044
> EC = 0x25: DABT (current EL), IL = 32 bits
> SET = 0, FnV = 0
> EA = 0, S1PTW = 0
> FSC = 0x04: level 0 translation fault
> Data abort info:
> ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000
> CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> user pgtable: 4k pages, 48-bit VAs, pgdp=000008000287c000
> [000001ffa6000034] pgd=0000000000000000, p4d=0000000000000000
> Call trace:
> __init_zone_device_page.constprop.0+0x2c/0xa8
> memmap_init_zone_device+0xf0/0x210
> pagemap_range+0x1e0/0x410
> memremap_pages+0x18c/0x2e0
> devm_memremap_pages+0x30/0x90
> kgd2kfd_init_zone_device+0xf0/0x200 [amdgpu]
> amdgpu_device_ip_init+0x674/0x888 [amdgpu]
> amdgpu_device_init+0x7a4/0xea0 [amdgpu]
> amdgpu_driver_load_kms+0x28/0x1c0 [amdgpu]
> amdgpu_pci_probe+0x1a0/0x560 [amdgpu]
> local_pci_probe+0x48/0xb8
> work_for_cpu_fn+0x24/0x40
> process_one_work+0x170/0x3e0
> worker_thread+0x2ac/0x3e0
> kthread+0xf4/0x108
> ret_from_fork+0x10/0x20
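
For readers following the commit message above, here is a minimal sketch of the bound it describes: a physical range is only usable if it ends at or below the last address covered by the linear map. This is an illustration, not the code from the patch. The PHYSMEM_END value and both helper names (fits_linear_map, highest_base_below) are hypothetical.

```python
# Hypothetical limit for illustration only; the real value depends on the
# kernel configuration and is not taken from the patch.
PHYSMEM_END = (1 << 47) - 1  # assumed last physical address in the linear map


def fits_linear_map(base: int, size: int, physmem_end: int = PHYSMEM_END) -> bool:
    """Return True if [base, base + size) ends at or below physmem_end."""
    if size <= 0:
        raise ValueError("size must be positive")
    return base >= 0 and base + size - 1 <= physmem_end


def highest_base_below(size: int, align: int, physmem_end: int = PHYSMEM_END):
    """Return the highest align-aligned base whose range still fits.

    Returns None when no range of this size fits below physmem_end.
    align must be a power of two.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if align <= 0 or (align & (align - 1)) != 0:
        raise ValueError("align must be a power of two")
    top = physmem_end + 1 - size  # highest unaligned base that still fits
    if top < 0:
        return None
    return top & ~(align - 1)  # round down to the alignment boundary


if __name__ == "__main__":
    base = highest_base_below(1 << 30, 1 << 21)  # 1 GiB region, 2 MiB aligned
    print(hex(base), fits_linear_map(base, 1 << 30))
```

Without such a bound, a top-down search for free physical address space can return a range above the limit, which matches the failure mode described in the commit message.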
--
With Best Regards,
Andy Shevchenko