Message-ID: <CAMuHMdUXj4QiSKdhf61xdDeu94=Hv0BXuCxykDpQwdY81_h2vw@mail.gmail.com>
Date: Tue, 29 Oct 2024 13:41:31 +0100
From: Geert Uytterhoeven <geert@...ux-m68k.org>
To: Oreoluwa Babatunde <quic_obabatun@...cinc.com>
Cc: robh@...nel.org, aisheng.dong@....com, andy@...ck.fi.intel.com,
catalin.marinas@....com, devicetree@...r.kernel.org, hch@....de,
iommu@...ts.linux.dev, kernel@...cinc.com, klarasmodin@...il.com,
linux-kernel@...r.kernel.org, m.szyprowski@...sung.com,
quic_ninanaik@...cinc.com, robin.murphy@....com, saravanak@...gle.com,
will@...nel.org
Subject: Re: [PATCH v10 1/2] of: reserved_mem: Restructure how the reserved
memory regions are processed
Hi Oreoluwa,
On Wed, Oct 9, 2024 at 12:08 AM Oreoluwa Babatunde
<quic_obabatun@...cinc.com> wrote:
> Reserved memory regions defined in the devicetree can be broken up into
> two groups:
> i) Statically-placed reserved memory regions
> i.e. regions defined with a static start address and size using the
> "reg" property.
> ii) Dynamically-placed reserved memory regions.
> i.e. regions defined by specifying an address range where they can be
> placed in memory using the "alloc-ranges" and "size" properties.
>
> These regions are processed and set aside at boot time.
> This is done in two stages as seen below:
>
> Stage 1:
> At this stage, fdt_scan_reserved_mem() scans through the child nodes of
> the reserved_memory node using the flattened devicetree and does the
> following:
>
> 1) If the node represents a statically-placed reserved memory region,
> i.e. if it is defined using the "reg" property:
> - Call memblock_reserve() or memblock_mark_nomap() as needed.
> - Add the information for that region into the reserved_mem array
> using fdt_reserved_mem_save_node().
> i.e. fdt_reserved_mem_save_node(node, name, base, size).
>
> 2) If the node represents a dynamically-placed reserved memory region,
> i.e. if it is defined using "alloc-ranges" and "size" properties:
> - Add the information for that region to the reserved_mem array with
> the starting address and size set to 0.
> i.e. fdt_reserved_mem_save_node(node, name, 0, 0).
> Note: This region is saved to the array with a starting address of 0
> because a starting address is not yet allocated for it.
>
> Stage 2:
> After iterating through all the reserved memory nodes and storing their
> relevant information in the reserved_mem array, fdt_init_reserved_mem() is
> called and does the following:
>
> 1) For statically-placed reserved memory regions:
> - Call the region specific init function using
> __reserved_mem_init_node().
> 2) For dynamically-placed reserved memory regions:
> - Call __reserved_mem_alloc_size() which is used to allocate memory
> for each of these regions, and mark them as nomap if they have the
> nomap property specified in the DT.
> - Call the region specific init function.
>
> The current size of the reserved_mem array is 64, as defined by
> MAX_RESERVED_REGIONS. This means that at most 64 reserved memory
> regions can be specified on a system.
> As systems continue to grow more and more complex, the number of
> reserved memory regions needed is also growing and is starting to hit
> this 64 count limit, hence the need to make the reserved_mem array
> dynamically sized (i.e. dynamically allocating memory for the
> reserved_mem array using memblock_alloc_*).
>
> On architectures such as arm64, memory allocated using memblock is
> writable only after the page tables have been set up. This means that if
> the reserved_mem array is going to be dynamically allocated, it needs to
> happen after the page tables have been set up, not before.
>
> Since the reserved memory regions are currently being processed and
> added to the array before the page tables are set up, there is a need to
> change the order in which some of the processing is done to allow for
> the reserved_mem array to be dynamically sized.
>
> It is possible to process the statically-placed reserved memory regions
> without needing to store them in the reserved_mem array until after the
> page tables have been set up because all the information stored in the
> array is readily available in the devicetree and can be referenced at
> any time.
> Dynamically-placed reserved memory regions on the other hand get
> assigned a start address only at runtime, and hence need a place to be
> stored once they are allocated since there is no other reference to the
> start address for these regions.
>
> Hence this patch changes the processing order of the reserved memory
> regions in the following ways:
>
> Step 1:
> fdt_scan_reserved_mem() scans through the child nodes of
> the reserved_memory node using the flattened devicetree and does the
> following:
>
> 1) If the node represents a statically-placed reserved memory region,
> i.e. if it is defined using the "reg" property:
> - Call memblock_reserve() or memblock_mark_nomap() as needed.
>
> 2) If the node represents a dynamically-placed reserved memory region,
> i.e. if it is defined using "alloc-ranges" and "size" properties:
> - Call __reserved_mem_alloc_size() which will:
> i) Allocate memory for the reserved region and call
> memblock_mark_nomap() as needed.
> ii) Call the region specific initialization function using
> fdt_init_reserved_mem_node().
> iii) Save the region information in the reserved_mem array using
> fdt_reserved_mem_save_node().
>
> Step 2:
> 1) This stage of the reserved memory processing is now only used to add
> the statically-placed reserved memory regions into the reserved_mem
> array using fdt_scan_reserved_mem_reg_nodes(), as well as call their
> region specific initialization functions.
>
> 2) This step has also been moved to be after the page tables are
> set up. Moving this will allow us to replace the reserved_mem
> array with a dynamically sized array before storing the rest of
> these regions.
>
> Signed-off-by: Oreoluwa Babatunde <quic_obabatun@...cinc.com>
Thanks for your patch, which is now commit 8a6e02d0c00e7b62
("of: reserved_mem: Restructure how the reserved memory regions
are processed") in dt-rh/for-next.
I have bisected a boot issue on RZ/Five to this commit.
With "earlycon keep_bootcon" (else there is no output):
Oops - store (or AMO) access fault [#1]
CPU: 0 UID: 0 PID: 1 Comm: swapper Not tainted
6.12.0-rc1-00015-g8a6e02d0c00e #201
Hardware name: Renesas SMARC EVK based on r9a07g043f01 (DT)
epc : __memset+0x60/0x100
ra : __dma_alloc_from_coherent+0x150/0x17a
epc : ffffffff8062d2bc ra : ffffffff80053a94 sp : ffffffc60000ba20
gp : ffffffff812e9938 tp : ffffffd601920000 t0 : ffffffc6000d0000
t1 : 0000000000000000 t2 : ffffffffe9600000 s0 : ffffffc60000baa0
s1 : ffffffc6000d0000 a0 : ffffffc6000d0000 a1 : 0000000000000000
a2 : 0000000000001000 a3 : ffffffc6000d1000 a4 : 0000000000000000
a5 : 0000000000000000 a6 : ffffffd601adacc0 a7 : ffffffd601a841a8
s2 : ffffffd6018573c0 s3 : 0000000000001000 s4 : ffffffd6019541e0
s5 : 0000000200000022 s6 : ffffffd6018f8410 s7 : ffffffd6018573e8
s8 : 0000000000000001 s9 : 0000000000000001 s10: 0000000000000010
s11: 0000000000000000 t3 : 0000000000000000 t4 : ffffffffdefe62d1
t5 : 000000001cd6a3a9 t6 : ffffffd601b2aad6
status: 0000000200000120 badaddr: ffffffc6000d0000 cause: 0000000000000007
[<ffffffff8062d2bc>] __memset+0x60/0x100
[<ffffffff80053e1a>] dma_alloc_from_global_coherent+0x1c/0x28
[<ffffffff80053056>] dma_direct_alloc+0x98/0x112
[<ffffffff8005238c>] dma_alloc_attrs+0x78/0x86
[<ffffffff8035fdb4>] rz_dmac_probe+0x3f6/0x50a
[<ffffffff803a0694>] platform_probe+0x4c/0x8a
[<ffffffff8039ea16>] really_probe+0xe4/0x1c8
[<ffffffff8039ebc4>] __driver_probe_device+0xca/0xce
[<ffffffff8039ec48>] driver_probe_device+0x34/0x92
[<ffffffff8039ede8>] __driver_attach+0xb4/0xbe
[<ffffffff8039ce58>] bus_for_each_dev+0x60/0xa0
[<ffffffff8039e26a>] driver_attach+0x1a/0x22
[<ffffffff8039dc20>] bus_add_driver+0xa4/0x184
[<ffffffff8039f65c>] driver_register+0x8a/0xb4
[<ffffffff803a051c>] __platform_driver_register+0x1c/0x24
[<ffffffff808202f6>] rz_dmac_driver_init+0x1a/0x22
[<ffffffff80800ef6>] do_one_initcall+0x64/0x134
[<ffffffff8080122e>] kernel_init_freeable+0x200/0x202
[<ffffffff80638126>] kernel_init+0x1e/0x10a
[<ffffffff8063d58e>] ret_from_fork+0xe/0x18
Code: 1007 82b3 40e2 0797 0000 8793 00e7 8305 97ba 8782 (b023) 00b2
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Fatal exception in interrupt
---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
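For readers decoding the register dump above: the "cause" value 7 is the RISC-V exception code for a store/AMO access fault (as opposed to code 15, a store/AMO page fault), and "badaddr" equals a0 in the dump. A minimal userspace sketch of that decoding follows. The function name riscv_exception_name() and the self-check helper are hypothetical and are not kernel APIs; only the exception codes themselves come from the RISC-V privileged specification.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a kernel function): map a RISC-V scause value
 * to the name of the synchronous exception it encodes.
 */
static const char *riscv_exception_name(uint64_t cause)
{
	if (cause >> 63)	/* top bit set means an interrupt, not an exception */
		return "interrupt";

	switch (cause) {
	case 0:  return "instruction address misaligned";
	case 1:  return "instruction access fault";
	case 2:  return "illegal instruction";
	case 3:  return "breakpoint";
	case 4:  return "load address misaligned";
	case 5:  return "load access fault";
	case 6:  return "store/AMO address misaligned";
	case 7:  return "store/AMO access fault";
	case 8:  return "environment call from U-mode";
	case 9:  return "environment call from S-mode";
	case 12: return "instruction page fault";
	case 13: return "load page fault";
	case 15: return "store/AMO page fault";
	default: return "reserved or unknown";
	}
}

/* Self-check using the values copied from the oops quoted above. */
static void check_oops_values(void)
{
	const uint64_t cause   = 0x0000000000000007ULL;
	const uint64_t badaddr = 0xffffffc6000d0000ULL;
	const uint64_t a0      = 0xffffffc6000d0000ULL;

	assert(strcmp(riscv_exception_name(cause), "store/AMO access fault") == 0);
	assert(badaddr == a0);	/* the faulting address is the value held in a0 */
}
```

Calling check_oops_values() from a small main() is enough to exercise it.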
Nothing really stands out in the kernel log, except for a delayed
initialization of the reserved mem nodes (they are the same
before/after):
printk: debug: ignoring loglevel setting.
-OF: reserved mem: 0x0000000000030000..0x000000000003ffff (64 KiB)
nomap non-reusable mmode_resv0@...00
-OF: reserved mem: 0x0000000000040000..0x000000000004ffff (64 KiB)
nomap non-reusable mmode_resv1@...00
-OF: reserved mem: 0x0000000044000000..0x000000004403ffff (256 KiB)
nomap non-reusable mmode_resv3@...00000
-OF: reserved mem: 0x0000000044040000..0x000000004405ffff (128 KiB)
nomap non-reusable mmode_resv2@...40000
+earlycon: scif0 at MMIO 0x000000001004b800 (options '115200n8')
+printk: legacy bootconsole [scif0] enabled
+printk: debug: skip boot console de-registration.
Reserved memory: created DMA memory pool at 0x0000000058000000, size 128 MiB
OF: reserved mem: initialized node pma_resv0@...00000, compatible id
shared-dma-pool
OF: reserved mem: 0x0000000058000000..0x000000005fffffff (131072 KiB)
nomap non-reusable pma_resv0@...00000
+OF: reserved mem: 0x0000000000030000..0x000000000003ffff (64 KiB)
nomap non-reusable mmode_resv0@...00
+OF: reserved mem: 0x0000000000040000..0x000000000004ffff (64 KiB)
nomap non-reusable mmode_resv1@...00
+OF: reserved mem: 0x0000000044040000..0x000000004405ffff (128 KiB)
nomap non-reusable mmode_resv2@...40000
+OF: reserved mem: 0x0000000044000000..0x000000004403ffff (256 KiB)
nomap non-reusable mmode_resv3@...00000
Zone ranges:
DMA32 [mem 0x0000000048000000-0x000000007fffffff]
Normal empty
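The later registration of the "reg"-defined nodes in the log follows from the reordering described in the quoted commit message. A rough userspace model of that two-step ordering is sketched below. Everything in it is a hypothetical stand-in (struct node, claim(), alloc_first_fit(), early_scan(), late_scan(), the fixed-size arrays); it is not the kernel code and does not model no-map handling or the region-specific init functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_REGIONS 8

/* Hypothetical stand-in for a child of /reserved-memory. */
struct node {
	const char *name;
	bool has_reg;		/* true: statically placed via "reg" */
	uint64_t base, size;	/* base is only meaningful when has_reg */
};

struct range { uint64_t base, size; };
struct saved_region { const char *name; uint64_t base, size; };

static struct range claimed[MAX_REGIONS];	/* models memblock reservations */
static int claimed_count;
static struct saved_region saved[MAX_REGIONS];	/* models the reserved_mem array */
static int saved_count;

static void claim(uint64_t base, uint64_t size)
{
	assert(claimed_count < MAX_REGIONS);
	claimed[claimed_count].base = base;
	claimed[claimed_count].size = size;
	claimed_count++;
}

static bool overlaps_claimed(uint64_t base, uint64_t size)
{
	for (int i = 0; i < claimed_count; i++)
		if (base < claimed[i].base + claimed[i].size &&
		    claimed[i].base < base + size)
			return true;
	return false;
}

/* Simplistic first-fit search in [lo, hi), stepping by the region size. */
static bool alloc_first_fit(uint64_t size, uint64_t lo, uint64_t hi, uint64_t *out)
{
	if (size == 0)
		return false;
	for (uint64_t base = lo; base + size <= hi; base += size) {
		if (!overlaps_claimed(base, size)) {
			claim(base, size);
			*out = base;
			return true;
		}
	}
	return false;
}

static void save(const char *name, uint64_t base, uint64_t size)
{
	assert(saved_count < MAX_REGIONS);
	saved[saved_count].name = name;
	saved[saved_count].base = base;
	saved[saved_count].size = size;
	saved_count++;
}

/* Step 1 (early): claim static regions, then allocate and save dynamic ones. */
static void early_scan(const struct node *nodes, int n, uint64_t lo, uint64_t hi)
{
	int dynamic[MAX_REGIONS], dyn = 0;

	assert(n <= MAX_REGIONS);
	for (int i = 0; i < n; i++) {
		if (nodes[i].has_reg)
			claim(nodes[i].base, nodes[i].size);
		else
			dynamic[dyn++] = i;	/* defer until all static regions are claimed */
	}
	for (int i = 0; i < dyn; i++) {
		const struct node *nd = &nodes[dynamic[i]];
		uint64_t base;

		if (alloc_first_fit(nd->size, lo, hi, &base))
			save(nd->name, base, nd->size);
	}
}

/* Step 2 (late): only now record the static regions in the array. */
static void late_scan(const struct node *nodes, int n)
{
	for (int i = 0; i < n; i++)
		if (nodes[i].has_reg && nodes[i].size)
			save(nodes[i].name, nodes[i].base, nodes[i].size);
}
```

In this model, calling early_scan() and then late_scan() on a node list leaves the dynamically placed regions at the front of the saved array and the "reg"-defined ones behind them, regardless of their order in the input.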
Reverting commits 00c9a452a235c61f ("of: reserved_mem: Add code to
dynamically allocate reserved_mem array") and 8a6e02d0c00e7b62 fixes
the issue.
root@...rc-rzfive:/sys/firmware/devicetree/base/reserved-memory# ls -l
total 0
-r--r--r-- 1 root root 4 Oct 29 12:37 #address-cells
-r--r--r-- 1 root root 4 Oct 29 12:37 #size-cells
drwxr-xr-x 2 root root 0 Oct 29 12:37 mmode_resv0@...00
drwxr-xr-x 2 root root 0 Oct 29 12:37 mmode_resv1@...00
drwxr-xr-x 2 root root 0 Oct 29 12:37 mmode_resv2@...40000
drwxr-xr-x 2 root root 0 Oct 29 12:37 mmode_resv3@...00000
-r--r--r-- 1 root root 16 Oct 29 12:37 name
drwxr-xr-x 2 root root 0 Oct 29 12:37 pma_resv0@...00000
-r--r--r-- 1 root root 0 Oct 29 12:37 ranges
> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
> index 4d528c10df3a..d0dbc8183ac4 100644
> --- a/drivers/of/fdt.c
> +++ b/drivers/of/fdt.c
> @@ -511,8 +511,6 @@ void __init early_init_fdt_scan_reserved_mem(void)
> break;
> memblock_reserve(base, size);
> }
> -
> - fdt_init_reserved_mem();
> }
>
> /**
> @@ -1212,6 +1210,9 @@ void __init unflatten_device_tree(void)
> {
> void *fdt = initial_boot_params;
>
> + /* Save the statically-placed regions in the reserved_mem array */
> + fdt_scan_reserved_mem_reg_nodes();
> +
> /* Don't use the bootloader provided DTB if ACPI is enabled */
> if (!acpi_disabled)
> fdt = NULL;
> diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
> index 04aa2a91f851..29525c0b9939 100644
> --- a/drivers/of/of_private.h
> +++ b/drivers/of/of_private.h
> @@ -9,6 +9,7 @@
> */
>
> #define FDT_ALIGN_SIZE 8
> +#define MAX_RESERVED_REGIONS 64
>
> /**
> * struct alias_prop - Alias property in 'aliases' node
> @@ -180,7 +181,7 @@ static inline struct device_node *__of_get_dma_parent(const struct device_node *
> #endif
>
> int fdt_scan_reserved_mem(void);
> -void fdt_init_reserved_mem(void);
> +void __init fdt_scan_reserved_mem_reg_nodes(void);
>
> bool of_fdt_device_is_available(const void *blob, unsigned long node);
>
> diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c
> index 46e1c3fbc769..2011174211f9 100644
> --- a/drivers/of/of_reserved_mem.c
> +++ b/drivers/of/of_reserved_mem.c
> @@ -27,7 +27,6 @@
>
> #include "of_private.h"
>
> -#define MAX_RESERVED_REGIONS 64
> static struct reserved_mem reserved_mem[MAX_RESERVED_REGIONS];
> static int reserved_mem_count;
>
> @@ -56,6 +55,7 @@ static int __init early_init_dt_alloc_reserved_memory_arch(phys_addr_t size,
> return err;
> }
>
> +static void __init fdt_init_reserved_mem_node(struct reserved_mem *rmem);
> /*
> * fdt_reserved_mem_save_node() - save fdt node for second pass initialization
> */
> @@ -74,6 +74,9 @@ static void __init fdt_reserved_mem_save_node(unsigned long node, const char *un
> rmem->base = base;
> rmem->size = size;
>
> + /* Call the region specific initialization function */
> + fdt_init_reserved_mem_node(rmem);
> +
> reserved_mem_count++;
> return;
> }
> @@ -106,7 +109,6 @@ static int __init __reserved_mem_reserve_reg(unsigned long node,
> phys_addr_t base, size;
> int len;
> const __be32 *prop;
> - int first = 1;
> bool nomap;
>
> prop = of_get_flat_dt_prop(node, "reg", &len);
> @@ -134,10 +136,6 @@ static int __init __reserved_mem_reserve_reg(unsigned long node,
> uname, &base, (unsigned long)(size / SZ_1M));
>
> len -= t_len;
> - if (first) {
> - fdt_reserved_mem_save_node(node, uname, base, size);
> - first = 0;
> - }
> }
> return 0;
> }
> @@ -165,12 +163,77 @@ static int __init __reserved_mem_check_root(unsigned long node)
> return 0;
> }
>
> +static void __init __rmem_check_for_overlap(void);
> +
> +/**
> + * fdt_scan_reserved_mem_reg_nodes() - Store info for the "reg" defined
> + * reserved memory regions.
> + *
> + * This function is used to scan through the DT and store the
> + * information for the reserved memory regions that are defined using
> + * the "reg" property. The region node number, name, base address, and
> + * size are all stored in the reserved_mem array by calling the
> + * fdt_reserved_mem_save_node() function.
> + */
> +void __init fdt_scan_reserved_mem_reg_nodes(void)
> +{
> + int t_len = (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32);
> + const void *fdt = initial_boot_params;
> + phys_addr_t base, size;
> + const __be32 *prop;
> + int node, child;
> + int len;
> +
> + if (!fdt)
> + return;
> +
> + node = fdt_path_offset(fdt, "/reserved-memory");
> + if (node < 0) {
> + pr_info("Reserved memory: No reserved-memory node in the DT\n");
> + return;
> + }
> +
> + if (__reserved_mem_check_root(node)) {
> + pr_err("Reserved memory: unsupported node format, ignoring\n");
> + return;
> + }
> +
> + fdt_for_each_subnode(child, fdt, node) {
> + const char *uname;
> +
> + prop = of_get_flat_dt_prop(child, "reg", &len);
> + if (!prop)
> + continue;
> + if (!of_fdt_device_is_available(fdt, child))
> + continue;
> +
> + uname = fdt_get_name(fdt, child, NULL);
> + if (len && len % t_len != 0) {
> + pr_err("Reserved memory: invalid reg property in '%s', skipping node.\n",
> + uname);
> + continue;
> + }
> + base = dt_mem_next_cell(dt_root_addr_cells, &prop);
> + size = dt_mem_next_cell(dt_root_size_cells, &prop);
> +
> + if (size)
> + fdt_reserved_mem_save_node(child, uname, base, size);
> + }
> +
> + /* check for overlapping reserved regions */
> + __rmem_check_for_overlap();
> +}
> +
> +static int __init __reserved_mem_alloc_size(unsigned long node, const char *uname);
> +
> /*
> * fdt_scan_reserved_mem() - scan a single FDT node for reserved memory
> */
> int __init fdt_scan_reserved_mem(void)
> {
> int node, child;
> + int dynamic_nodes_cnt = 0;
> + int dynamic_nodes[MAX_RESERVED_REGIONS];
> const void *fdt = initial_boot_params;
>
> node = fdt_path_offset(fdt, "/reserved-memory");
> @@ -192,8 +255,24 @@ int __init fdt_scan_reserved_mem(void)
> uname = fdt_get_name(fdt, child, NULL);
>
> err = __reserved_mem_reserve_reg(child, uname);
> - if (err == -ENOENT && of_get_flat_dt_prop(child, "size", NULL))
> - fdt_reserved_mem_save_node(child, uname, 0, 0);
> + /*
> + * Save the nodes for the dynamically-placed regions
> + * into an array which will be used for allocation right
> + * after all the statically-placed regions are reserved
> + * or marked as no-map. This is done to avoid dynamically
> + * allocating from one of the statically-placed regions.
> + */
> + if (err == -ENOENT && of_get_flat_dt_prop(child, "size", NULL)) {
> + dynamic_nodes[dynamic_nodes_cnt] = child;
> + dynamic_nodes_cnt++;
> + }
> + }
> + for (int i = 0; i < dynamic_nodes_cnt; i++) {
> + const char *uname;
> +
> + child = dynamic_nodes[i];
> + uname = fdt_get_name(fdt, child, NULL);
> + __reserved_mem_alloc_size(child, uname);
> }
> return 0;
> }
> @@ -253,8 +332,7 @@ static int __init __reserved_mem_alloc_in_range(phys_addr_t size,
> * __reserved_mem_alloc_size() - allocate reserved memory described by
> * 'size', 'alignment' and 'alloc-ranges' properties.
> */
> -static int __init __reserved_mem_alloc_size(unsigned long node,
> - const char *uname, phys_addr_t *res_base, phys_addr_t *res_size)
> +static int __init __reserved_mem_alloc_size(unsigned long node, const char *uname)
> {
> int t_len = (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32);
> phys_addr_t start = 0, end = 0;
> @@ -334,9 +412,8 @@ static int __init __reserved_mem_alloc_size(unsigned long node,
> return -ENOMEM;
> }
>
> - *res_base = base;
> - *res_size = size;
> -
> + /* Save region in the reserved_mem array */
> + fdt_reserved_mem_save_node(node, uname, base, size);
> return 0;
> }
>
> @@ -425,48 +502,37 @@ static void __init __rmem_check_for_overlap(void)
> }
>
> /**
> - * fdt_init_reserved_mem() - allocate and init all saved reserved memory regions
> + * fdt_init_reserved_mem_node() - Initialize a reserved memory region
> + * @rmem: reserved_mem struct of the memory region to be initialized.
> + *
> + * This function is used to call the region specific initialization
> + * function for a reserved memory region.
> */
> -void __init fdt_init_reserved_mem(void)
> +static void __init fdt_init_reserved_mem_node(struct reserved_mem *rmem)
> {
> - int i;
> -
> - /* check for overlapping reserved regions */
> - __rmem_check_for_overlap();
> -
> - for (i = 0; i < reserved_mem_count; i++) {
> - struct reserved_mem *rmem = &reserved_mem[i];
> - unsigned long node = rmem->fdt_node;
> - int err = 0;
> - bool nomap;
> + unsigned long node = rmem->fdt_node;
> + int err = 0;
> + bool nomap;
>
> - nomap = of_get_flat_dt_prop(node, "no-map", NULL) != NULL;
> + nomap = of_get_flat_dt_prop(node, "no-map", NULL) != NULL;
>
> - if (rmem->size == 0)
> - err = __reserved_mem_alloc_size(node, rmem->name,
> - &rmem->base, &rmem->size);
> - if (err == 0) {
> - err = __reserved_mem_init_node(rmem);
> - if (err != 0 && err != -ENOENT) {
> - pr_info("node %s compatible matching fail\n",
> - rmem->name);
> - if (nomap)
> - memblock_clear_nomap(rmem->base, rmem->size);
> - else
> - memblock_phys_free(rmem->base,
> - rmem->size);
> - } else {
> - phys_addr_t end = rmem->base + rmem->size - 1;
> - bool reusable =
> - (of_get_flat_dt_prop(node, "reusable", NULL)) != NULL;
> -
> - pr_info("%pa..%pa (%lu KiB) %s %s %s\n",
> - &rmem->base, &end, (unsigned long)(rmem->size / SZ_1K),
> - nomap ? "nomap" : "map",
> - reusable ? "reusable" : "non-reusable",
> - rmem->name ? rmem->name : "unknown");
> - }
> - }
> + err = __reserved_mem_init_node(rmem);
> + if (err != 0 && err != -ENOENT) {
> + pr_info("node %s compatible matching fail\n", rmem->name);
> + if (nomap)
> + memblock_clear_nomap(rmem->base, rmem->size);
> + else
> + memblock_phys_free(rmem->base, rmem->size);
> + } else {
> + phys_addr_t end = rmem->base + rmem->size - 1;
> + bool reusable =
> + (of_get_flat_dt_prop(node, "reusable", NULL)) != NULL;
> +
> + pr_info("%pa..%pa (%lu KiB) %s %s %s\n",
> + &rmem->base, &end, (unsigned long)(rmem->size / SZ_1K),
> + nomap ? "nomap" : "map",
> + reusable ? "reusable" : "non-reusable",
> + rmem->name ? rmem->name : "unknown");
> }
> }
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@...ux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds