Message-ID: <Zx_6LNyA97_MeBIB@PC2K9PVX.TheFacebook.com>
Date: Mon, 28 Oct 2024 16:55:08 -0400
From: Gregory Price <gourry@...rry.net>
To: Mike Rapoport <rppt@...nel.org>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org,
linux-acpi@...r.kernel.org, linux-mm@...ck.org, linux-cxl@...ck.org,
Jonathan.Cameron@...wei.com, dan.j.williams@...el.com,
rrichter@....com, Terry.Bowman@....com, dave.jiang@...el.com,
ira.weiny@...el.com, alison.schofield@...el.com,
dave.hansen@...ux.intel.com, luto@...nel.org, peterz@...radead.org,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
rafael@...nel.org, lenb@...nel.org, david@...hat.com,
osalvador@...e.de, gregkh@...uxfoundation.org,
akpm@...ux-foundation.org
Subject: Re: [PATCH v3 3/3] acpi,srat: give memory block size advice based on
CFMWS alignment
On Mon, Oct 28, 2024 at 07:24:54PM +0200, Mike Rapoport wrote:
> On Tue, Oct 22, 2024 at 05:34:50PM -0400, Gregory Price wrote:
> > Capacity is stranded when CFMWS regions are not aligned to block size.
> > On x86, block size increases with capacity (2G blocks @ 64G capacity).
> >
> > Use CFMWS base/size to report memory block size alignment advice.
> >
> > After the alignment, the acpi code begins populating numa nodes with
> > memblocks, so probe the value just prior to lock it in. All future
> > callers should be providing advice prior to this point.
> >
> > Suggested-by: Dan Williams <dan.j.williams@...el.com>
> > Signed-off-by: Gregory Price <gourry@...rry.net>
> > ---
> > drivers/acpi/numa/srat.c | 33 +++++++++++++++++++++++++++++++++
> > 1 file changed, 33 insertions(+)
> >
... snip ...
> > + /* Align memblock size to CFMW regions if possible */
> > + acpi_table_parse_cedt(ACPI_CEDT_TYPE_CFMWS, acpi_align_cfmws, NULL);
> > +
> > + /*
> > + * Nodes start populating with blocks after this, so probe the max
> > + * block size to prevent it from changing in the future.
> > + */
> > + memory_block_probe_max_size();
> > +
>
> It won't change, but how drivers/base/memory.c will know about the probed
> size if architecture does not override memory_block_size_bytes()?
>
Non-arch code should call memory_block_size_bytes() to discover the
actual size of blocks, and for archs that care about this value, that
call is the point at which it should be probed. It's up to the arch
whether and how to use this information. Many archs ignore it entirely
and use MIN_MEMORY_BLOCK_SIZE.

Basically, non-arch code shouldn't care what this value is, and even
most arch code shouldn't care.
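
To illustrate the split described above, here is a rough userspace
model, not kernel code. Every name in it (model_advise_max_size,
model_block_size_arch, MODEL_MIN_BLOCK_SIZE and so on) is hypothetical
and only stands in for the real interfaces. The 128 MiB minimum is an
assumption picked for the example.

```c
#include <stdint.h>

/* Hypothetical stand-in for the minimum block size (assumed 128 MiB). */
#define MODEL_MIN_BLOCK_SIZE (UINT64_C(128) << 20)

/* Advice recorded by early callers; 0 means "no advice given". */
static uint64_t model_advised_size;

/* Early boot code (for example a CFMWS parser) records its advice here. */
static void model_advise_max_size(uint64_t size)
{
	/* Keep the smallest advice so every advised region stays aligned. */
	if (size && (!model_advised_size || size < model_advised_size))
		model_advised_size = size;
}

/* An arch that ignores the advice and always uses the minimum. */
static uint64_t model_block_size_default(void)
{
	return MODEL_MIN_BLOCK_SIZE;
}

/* An arch that consumes the advice the first time the size is requested. */
static uint64_t model_block_size_arch(void)
{
	static uint64_t cached;

	if (!cached) {
		cached = model_advised_size;
		if (cached < MODEL_MIN_BLOCK_SIZE)
			cached = MODEL_MIN_BLOCK_SIZE;
	}
	return cached;
}

/* Generic code only sees the accessor, never the advice behind it. */
static uint64_t model_blocks_for(uint64_t (*block_size)(void), uint64_t bytes)
{
	uint64_t bs = block_size();

	return (bytes + bs - 1) / bs;
}
```

In this model the generic helper gets the same kind of answer from
either accessor, which is the point: only the arch side decides
whether the advice matters.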
I added this call to probe in order to lock in the size, since I saw
that nodes start populating with blocks immediately after this point.

Possibly the APIs should be marked __init so that the whole interface
disappears after init, which would avoid misuse post-init.

Possibly probe() should return -EBUSY if called more than once, to
enforce a particular probe pattern on the architectures?
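
For discussion, a minimal userspace sketch of the -EBUSY idea. This is
not the code in the patch. The names (sketch_advise, sketch_probe,
SKETCH_MIN_BLOCK_SIZE) are made up for illustration, and having the
advise side also fail with -EBUSY after the probe is an extra
assumption on my part.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical minimum block size (assumed 128 MiB for this sketch). */
#define SKETCH_MIN_BLOCK_SIZE (UINT64_C(128) << 20)

static uint64_t sketch_advised; /* 0 means no advice was given */
static bool sketch_probed;      /* set once the size is locked in */

/* Record advice; refused once the size has been locked in. */
static int sketch_advise(uint64_t size)
{
	if (sketch_probed)
		return -EBUSY;
	if (size < SKETCH_MIN_BLOCK_SIZE)
		return -EINVAL;
	/* Keep the smallest advice seen so far. */
	if (!sketch_advised || size < sketch_advised)
		sketch_advised = size;
	return 0;
}

/* The first call locks the size; any later call fails with -EBUSY. */
static int sketch_probe(uint64_t *size_out)
{
	if (sketch_probed)
		return -EBUSY;
	sketch_probed = true;
	*size_out = sketch_advised ? sketch_advised : SKETCH_MIN_BLOCK_SIZE;
	return 0;
}
```

With this shape a second probe, or any advice arriving after the
probe, gets a hard error instead of being silently ignored, which
would make ordering bugs visible during bring-up.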
Open to thoughts here.
> > /* fake_pxm is the next unused PXM value after SRAT parsing */
> > for (i = 0, fake_pxm = -1; i < MAX_NUMNODES; i++) {
> > if (node_to_pxm_map[i] > fake_pxm)
> > --
> > 2.43.0
> >
>
> --
> Sincerely yours,
> Mike.