Message-ID: <ZyrBcDvFVtSFPhvG@gpd3>
Date: Wed, 6 Nov 2024 02:08:00 +0100
From: Andrea Righi <arighi@...dia.com>
To: Tejun Heo <tj@...nel.org>
Cc: David Vernet <void@...ifault.com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH sched_ext/for-6.13] sched_ext: Do not enable LLC/NUMA
optimizations when domains overlap
On Tue, Nov 05, 2024 at 02:33:53PM -1000, Tejun Heo wrote:
> Hello,
>
> On Wed, Nov 06, 2024 at 01:29:08AM +0100, Andrea Righi wrote:
> ...
> > Let's say we have 2 NUMA nodes, each with 2 sockets, and each socket
> > has its own L3 cache. In this case, numa_cpus will be larger than
> > llc_cpus, and enabling both NUMA and LLC optimizations would be
> > beneficial.
> >
> > On the other hand, if each NUMA node contains only 1 socket, numa_cpus
> > and llc_cpus will overlap completely, making it unnecessary to enable
> > both NUMA and LLC optimizations, so we can have just the LLC in this
> > case.
> >
> > Would something like this help clarify the first test?
>
> I was more thinking about the theoretical case where one socket has one LLC
> while a different socket has multiple LLCs. I don't think there are any
> systems which are actually like that but there's nothing in the code which
> prevents that (unlike a single CPU belonging to multiple domains), so it'd
> probably be worthwhile to explain why the abbreviated test is enough.
In theory a CPU can only belong to a single domain (otherwise other
code in topology.c would be broken as well), but we could still have
something like:
NUMA 1
- CPU 1 (L3)
NUMA 2
- CPU 2 (L3)
- CPU 3 (L3)
If we inspect only CPU 1, we may incorrectly conclude that numa_cpus ==
llc_cpus. To handle this properly we may have to inspect all the CPUs,
not just the first one.
Moreover, with qemu we can also simulate ugly topologies like 2 NUMA
nodes and 1 L3 cache that covers the 2 NUMA nodes:
arighi@...3~/s/linux (master)> vng --cpu 4 -m 4G --numa 2G,cpus=0-1 --numa 2G,cpus=2-3
...
arighi@...tme-ng~/s/linux (master)> lscpu -e
CPU NODE SOCKET CORE L1d:L1i:L2:L3 ONLINE
  0    0      0    0 0:0:0:0          yes
  1    0      0    1 1:1:1:0          yes
  2    1      0    2 2:2:2:0          yes
  3    1      0    3 3:3:3:0          yes
arighi@...tme-ng~/s/linux (master)> numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1
node 0 size: 2014 MB
node 0 free: 1931 MB
node 1 cpus: 2 3
node 1 size: 1896 MB
node 1 free: 1847 MB
node distances:
node 0 1
0: 10 20
1: 20 10
I think this is only possible in a virtualized environment; in this case
LLC should be disabled and NUMA enabled. Maybe it's also worth checking
for the case where LLC > NUMA...
-Andrea