Message-ID: <2025072834-getaway-fling-0d66@gregkh>
Date: Mon, 28 Jul 2025 06:28:12 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: Justin He <Justin.He@....com>
Cc: "Rafael J. Wysocki" <rafael@...nel.org>,
Danilo Krummrich <dakr@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: percpu: Introduce normalized CPU-to-NUMA node mapping to reduce max_distance
On Mon, Jul 28, 2025 at 02:54:42AM +0000, Justin He wrote:
> Hi Greg
>
> > -----Original Message-----
> > From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
> > Sent: Tuesday, July 22, 2025 1:45 PM
> > To: Justin He <Justin.He@....com>
> > Cc: Rafael J. Wysocki <rafael@...nel.org>; Danilo Krummrich
> > <dakr@...nel.org>; linux-kernel@...r.kernel.org
> > Subject: Re: [PATCH] mm: percpu: Introduce normalized CPU-to-NUMA node
Odd quoting, please fix your email client :(
> > > In this configuration, pcpu_embed_first_chunk() computes a large
> > > max_distance:
> > > percpu: max_distance=0x5fffbfac0000 too large for vmalloc space 0x7bff70000000
> > >
> > > As a result, the allocator falls back to pcpu_page_first_chunk(),
> > > which uses page-by-page allocation with nr_groups = 1, leading to
> > > degraded performance.
> >
> > But that's intentional, you don't want to go across the nodes, right?
> My intention is to
Did something get dropped?
> > > This patch introduces a normalized CPU-to-NUMA node mapping to
> > > mitigate the issue. Distances of 10 and 16 are treated as local
> > > (LOCAL_DISTANCE),
> >
> > Why? What is this going to now break on those systems that assumed that
> > those were NOT local?
> The normalization only affects percpu allocations - possibly only dynamic ones.
"possibly" doesn't instill much confidence here...
> Other mechanisms, such as cpu_to_node_map, remain unaffected and continue
> to function as before in those contexts.
percpu allocations are the "hottest" path we have, so without testing
this on systems that were working well before your change, I don't think
we could ever accept this, right?
> > What did you test this on?
> >
> Testing was conducted on an Arm64 N2 server with 256 CPUs and 64 GB of memory.
> (Apologies, but I am not authorized to disclose the exact hardware specifications.)
That's fine, but why didn't you test this on older systems that this
code was originally written for? You don't want to have regressions on
them, right?
thanks,
greg k-h