lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <c43c64f4-e697-468e-80af-87bd02a95ba2@sk.com>
Date: Thu, 27 Feb 2025 12:20:03 +0900
From: Honggyu Kim <honggyu.kim@...com>
To: Joshua Hahn <joshua.hahnjy@...il.com>, gourry@...rry.net,
 harry.yoo@...cle.com, ying.huang@...ux.alibaba.com
Cc: kernel_team@...ynix.com, gregkh@...uxfoundation.org, rakie.kim@...com,
 akpm@...ux-foundation.org, rafael@...nel.org, lenb@...nel.org,
 dan.j.williams@...el.com, Jonathan.Cameron@...wei.com, dave.jiang@...el.com,
 horen.chuang@...ux.dev, hannes@...xchg.org, linux-kernel@...r.kernel.org,
 linux-acpi@...r.kernel.org, linux-mm@...ck.org, kernel-team@...a.com,
 yunjeong.mun@...com
Subject: Re: [PATCH 2/2 v6] mm/mempolicy: Don't create weight sysfs for
 memoryless nodes



On 2/27/2025 11:32 AM, Honggyu Kim wrote:
> Hi Joshua,
> 
> On 2/27/2025 6:35 AM, Joshua Hahn wrote:
>> We should never try to allocate memory from a memoryless node. Creating a
>> sysfs knob to control its weighted interleave weight does not make sense,
>> and can be unsafe.
>>
>> Only create weighted interleave weight knobs for nodes with memory.
>>
>> Signed-off-by: Joshua Hahn <joshua.hahnjy@...il.com>
>> ---
>>   mm/mempolicy.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 4cc04ff8f12c..50cbb7c047fa 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -3721,7 +3721,7 @@ static int add_weighted_interleave_group(struct 
>> kobject *root_kobj)
>>           return err;
>>       }
>> -    for_each_node_state(nid, N_POSSIBLE) {
> 
> Actually, we're aware of this issue and currently trying to fix this.
> In our system, we've attached 4ch of CXL memory for each socket as
> follows.
> 
>          node0             node1
>        +-------+   UPI   +-------+
>        | CPU 0 |-+-----+-| CPU 1 |
>        +-------+         +-------+
>        | DRAM0 |         | DRAM1 |
>        +---+---+         +---+---+
>            |                 |
>        +---+---+         +---+---+
>        | CXL 0 |         | CXL 4 |
>        +---+---+         +---+---+
>        | CXL 1 |         | CXL 5 |
>        +---+---+         +---+---+
>        | CXL 2 |         | CXL 6 |
>        +---+---+         +---+---+
>        | CXL 3 |         | CXL 7 |
>        +---+---+         +---+---+
>          node2             node3
> 
> The 4ch of CXL memory are detected as a single NUMA node in each socket,
> but it shows as follows with the current N_POSSIBLE loop.
> 
> $ ls /sys/kernel/mm/mempolicy/weighted_interleave/
> node0 node1 node2 node3 node4 node5
> node6 node7 node8 node9 node10 node11
> 
>> +    for_each_node_state(nid, N_MEMORY) {

Thinking it again, we can leave it as a separate patch but add our patch 
on top of it.

The only concern I have is having only N_MEMORY patch hides weight
setting knobs for CXL memory and it makes there is no way to set weight 
values to CXL memory in my system.

IMHO, this and our patch is better to be submitted together.

Thanks,
Honggyu

> 
> But using N_MEMORY doesn't fix this problem and it hides the entire CXL
> memory nodes in our system because the CXL memory isn't detected at this
> point of creating node*.  Maybe there is some difference when multiple
> CXL memory is detected as a single node.
> 
> We have to create more nodes when CXL memory is detected later.  In 
> addition, this part can be changed to "for_each_online_node(nid)"
> although N_MEMORY is also fine here.
> 
> We've internally fixed it using a memory hotpluging callback so we can
> upload another working version later.
> 
> Do you mind if we continue fixing this work?
> 
> Thanks,
> Honggyu
> 
>>           err = add_weight_node(nid, wi_kobj);
>>           if (err) {
>>               pr_err("failed to add sysfs [node%d]\n", nid);
> 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ