Message-ID: <Z8cqe3BCdobsV4-2@gourry-fedora-PF4VCD3F>
Date: Tue, 4 Mar 2025 11:29:47 -0500
From: Gregory Price <gourry@...rry.net>
To: Honggyu Kim <honggyu.kim@...com>
Cc: Joshua Hahn <joshua.hahnjy@...il.com>, harry.yoo@...cle.com,
	ying.huang@...ux.alibaba.com, kernel_team@...ynix.com,
	gregkh@...uxfoundation.org, rakie.kim@...com,
	akpm@...ux-foundation.org, rafael@...nel.org, lenb@...nel.org,
	dan.j.williams@...el.com, Jonathan.Cameron@...wei.com,
	dave.jiang@...el.com, horen.chuang@...ux.dev, hannes@...xchg.org,
	linux-kernel@...r.kernel.org, linux-acpi@...r.kernel.org,
	linux-mm@...ck.org, kernel-team@...a.com, yunjeong.mun@...com
Subject: Re: [PATCH 2/2 v6] mm/mempolicy: Don't create weight sysfs for
 memoryless nodes

On Thu, Feb 27, 2025 at 11:32:26AM +0900, Honggyu Kim wrote:
> Actually, we're aware of this issue and currently trying to fix this.
> In our system, we've attached 4ch of CXL memory for each socket as
> follows.
> 
>         node0             node1
>       +-------+   UPI   +-------+
>       | CPU 0 |-+-----+-| CPU 1 |
>       +-------+         +-------+
>       | DRAM0 |         | DRAM1 |
>       +---+---+         +---+---+
>           |                 |
>       +---+---+         +---+---+
>       | CXL 0 |         | CXL 4 |
>       +---+---+         +---+---+
>       | CXL 1 |         | CXL 5 |
>       +---+---+         +---+---+
>       | CXL 2 |         | CXL 6 |
>       +---+---+         +---+---+
>       | CXL 3 |         | CXL 7 |
>       +---+---+         +---+---+
>         node2             node3
> 
> The 4ch of CXL memory are detected as a single NUMA node in each socket,
> but it shows as follows with the current N_POSSIBLE loop.
> 
> $ ls /sys/kernel/mm/mempolicy/weighted_interleave/
> node0 node1 node2 node3 node4 node5
> node6 node7 node8 node9 node10 node11

This is insufficient information for me to assess the correctness of the
configuration. Can you please show the contents of your CEDT/CFMWS and
SRAT/Memory Affinity structures?

mkdir acpi_data && cd acpi_data
acpidump -b
iasl -d *
cat cedt.dsl  <- find all CFMWS entries
cat srat.dsl  <- find all Memory Affinity entries

Basically I need to know:
1) Is each CXL device on a dedicated Host Bridge?
2) Is inter-host-bridge interleaving configured?
3) Is intra-host-bridge interleaving configured?
4) Do SRAT entries exist for all nodes?
5) Why are there 12 nodes but only 10 sources? Are there additional
   devices left out of your diagram? Are there 2 CFMWS and 8 Memory
   Affinity records, resulting in 10 nodes? This is strange.

By default, Linux creates a node for each proximity domain ("PXM")
detected in the SRAT Memory Affinity tables. If SRAT entries for a
memory region described in a CFMWS are absent, it will also create a
node for that CFMWS.
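
The rule above can be sketched as a small counting model. This is a
simplification under stated assumptions, not the kernel's code: a CFMWS
window is treated as "described" if any SRAT Memory Affinity range overlaps
it, and the function name and all addresses in the usage note are
hypothetical.

```python
def count_nodes(srat_ranges, cfmws_windows):
    """Estimate the number of NUMA nodes created.

    srat_ranges:   list of (pxm, base, size) from SRAT Memory Affinity entries
    cfmws_windows: list of (base, size) from CEDT CFMWS entries
    """
    # One node per distinct proximity domain seen in SRAT.
    pxms = {pxm for pxm, _, _ in srat_ranges}

    # One extra node per CFMWS window that no SRAT range overlaps
    # (ASSUMPTION: overlap is what counts as "described").
    extra = 0
    for w_base, w_size in cfmws_windows:
        w_end = w_base + w_size
        described = any(
            base < w_end and w_base < base + size
            for _, base, size in srat_ranges
        )
        if not described:
            extra += 1

    return len(pxms) + extra
```

For example, with two DRAM PXMs in SRAT and two CFMWS windows that have no
SRAT entries, the model gives 2 + 2 = 4 nodes. If SRAT also describes both
windows as PXM 2 and PXM 3, it gives 4 + 0 = 4 nodes. Either way, a layout
like the one in your diagram would not be expected to produce 12.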

Your reported configuration and results lead me to believe you have
a combination of CFMWS/SRAT configurations that is unexpected.

~Gregory
