Message-ID: <Z0dbsRCsWT3hiVds@arm.com>
Date: Wed, 27 Nov 2024 17:49:37 +0000
From: Catalin Marinas <catalin.marinas@....com>
To: Yang Shi <yang@...amperecomputing.com>
Cc: Baruch Siach <baruch@...s.co.il>, will@...nel.org, ptesarik@...e.com,
hch@....de, jiangyutang@...amperecomputing.com,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
Robin Murphy <robin.murphy@....com>
Subject: Re: [PATCH] arm64: mm: fix zone_dma_limit calculation
+ Robin
On Tue, Nov 26, 2024 at 09:38:22AM -0800, Yang Shi wrote:
> On 11/25/24 10:27 PM, Baruch Siach wrote:
> > On Mon, Nov 25 2024, Yang Shi wrote:
> > > The commit ba0fb44aed47 ("dma-mapping: replace zone_dma_bits by
> > > zone_dma_limit") changed how zone_dma_limit is calculated. It now
> > > returns the memory size limit from IORT or the device tree instead of
> > > U32_MAX when that limit is greater than U32_MAX.
> >
> > Can you give a concrete example of memory layout and dma-ranges that
> > demonstrates this issue?
>
> Our 2-socket system has physical memory starting at 0x0 on node 0 and
> 0x200000000000 on node 1. The memory address size limit defined in IORT
> is 0x30 (48 bits).
>
> The DMA zone is:
>
> pages free     887722
>       boost    0
>       min      229
>       low      1108
>       high     1987
>       promo    2866
>       spanned  983040
>       present  982034
>       managed  903238
>       cma      16384
>   protection: (0, 0, 124824, 0, 0)
> start_pfn:     65536
> When allocating a DMA buffer, dma_direct_optimal_gfp_mask() is called to
> determine the proper zone constraints. If the phys_limit is not above
> zone_dma_limit, it will use GFP_DMA. But zone_dma_limit is 0xffffffffffff
> on v6.12 instead of 4G as it was prior to v6.12, which means all DMA
> buffer allocations go to the DMA zone even though the devices don't
> require it.
>
> The DMA zone is on node 0, so we saw excessive remote accesses on the
> 2-socket system.
[...]
> The physical address range of the DMA zone is correct; the problem is
> the wrong zone_dma_limit. Before commit ba0fb44aed47, zone_dma_limit was
> 4G; after it, it is the whole memory even though the DMA zone only
> covers the low 4G.
Thanks for the details. I agree that zone_dma_limit shouldn't be higher
than the ZONE_DMA upper boundary, otherwise it gets confusing for
functions like dma_direct_optimal_gfp_mask() and we may force
allocations to a specific range unnecessarily.
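To illustrate, the decision there is roughly as follows (paraphrasing
kernel/dma/direct.c from memory, not the exact upstream code):

	u64 dma_limit = min_not_zero(dev->coherent_dma_mask, dev->bus_dma_limit);

	/* highest physical address the device can reach */
	*phys_limit = dma_to_phys(dev, dma_limit);
	if (*phys_limit <= zone_dma_limit)
		return GFP_DMA;
	if (*phys_limit <= DMA_BIT_MASK(32))
		return GFP_DMA32;
	return 0;

With zone_dma_limit at 48 bits, a 48-bit capable device takes the
GFP_DMA branch even though it could be served from anywhere in RAM.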
If IORT or DT indicates a large mask covering the whole RAM (i.e. no
restrictions), ideally we would extend ZONE_DMA to match. One problem is
ZONE_DMA32 (and GFP_DMA32) and the fact that
ZONE_DMA sits below it. Until we hear otherwise, we assume a DMA offset
of 0 for such 32-bit devices and therefore define ZONE_DMA32 in the
lower 4GB if RAM starts below this limit (and an empty ZONE_DMA32 if RAM
starts above).
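For reference, this is roughly the current ZONE_DMA32 handling in
zone_sizes_init() (quoting from memory, the exact lines may differ):

	/* ZONE_DMA32 is always capped at 32 bits, regardless of dma-ranges */
	phys_addr_t __maybe_unused dma32_phys_limit =
		max_zone_phys(DMA_BIT_MASK(32));
	...
	max_zone_pfns[ZONE_DMA32] = PFN_DOWN(dma32_phys_limit);

If RAM starts above 4GB, dma32_phys_limit ends up below the first
present pfn and ZONE_DMA32 is simply empty.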
Another aspect to consider is that we don't always have DT or IORT
information, or some devices need a smaller limit than what's advertised
in the firmware tables (typically 32-bit masks). This code went through
a couple of fixes already:
833bd284a454 ("arm64: mm: fix DMA zone when dma-ranges is missing")
122c234ef4e1 ("arm64: mm: keep low RAM dma zone")
Since we currently assume that ZONE_DMA stays within 32 bits when there
is RAM below 4GB, I'm happy to make this conditional on CONFIG_ZONE_DMA32
also being enabled. So, from your patch below:
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index d21f67d67cf5..ccdef53872a0 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -117,15 +117,6 @@ static void __init arch_reserve_crashkernel(void)
> static phys_addr_t __init max_zone_phys(phys_addr_t zone_limit)
> {
> - /**
> - * Information we get from firmware (e.g. DT dma-ranges) describe DMA
> - * bus constraints. Devices using DMA might have their own limitations.
> - * Some of them rely on DMA zone in low 32-bit memory. Keep low RAM
> - * DMA zone on platforms that have RAM there.
> - */
> - if (memblock_start_of_DRAM() < U32_MAX)
> - zone_limit = min(zone_limit, U32_MAX);
> -
> return min(zone_limit, memblock_end_of_DRAM() - 1) + 1;
> }
This part is fine.
> @@ -141,6 +132,14 @@ static void __init zone_sizes_init(void)
> acpi_zone_dma_limit = acpi_iort_dma_get_max_cpu_address();
> dt_zone_dma_limit = of_dma_get_max_cpu_address(NULL);
> zone_dma_limit = min(dt_zone_dma_limit, acpi_zone_dma_limit);
> + /*
> + * Information we get from firmware (e.g. DT dma-ranges) describe DMA
> + * bus constraints. Devices using DMA might have their own limitations.
> + * Some of them rely on DMA zone in low 32-bit memory. Keep low RAM
> + * DMA zone on platforms that have RAM there.
> + */
> + if (memblock_start_of_DRAM() < U32_MAX)
> + zone_dma_limit = min(zone_dma_limit, U32_MAX);
> arm64_dma_phys_limit = max_zone_phys(zone_dma_limit);
> max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
> #endif
But I'd move the zone_dma_limit update further down in the
CONFIG_ZONE_DMA32 block. I think we only need to limit it to
dma32_phys_limit and ignore the U32_MAX check. The former is already
capped to 32-bit. For the second hunk, something like below (untested):
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index d21f67d67cf5..ffaf5bd8d0a1 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -146,8 +146,10 @@ static void __init zone_sizes_init(void)
#endif
#ifdef CONFIG_ZONE_DMA32
max_zone_pfns[ZONE_DMA32] = PFN_DOWN(dma32_phys_limit);
- if (!arm64_dma_phys_limit)
+ if (!arm64_dma_phys_limit || arm64_dma_phys_limit > dma32_phys_limit) {
arm64_dma_phys_limit = dma32_phys_limit;
+ zone_dma_limit = arm64_dma_phys_limit - 1;
+ }
#endif
if (!arm64_dma_phys_limit)
arm64_dma_phys_limit = PHYS_MASK + 1;
This would need a comment on why we do it, though most likely not the
current comment from max_zone_phys() (more along the lines of keeping
ZONE_DMA below ZONE_DMA32).
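E.g. something along these lines (rough wording only):

	/*
	 * Keep ZONE_DMA below ZONE_DMA32: if the firmware-advertised DMA
	 * limit is above 32 bits, cap arm64_dma_phys_limit/zone_dma_limit
	 * to dma32_phys_limit so that zone_dma_limit matches the actual
	 * upper bound of ZONE_DMA.
	 */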
--
Catalin