lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e29293fd-b81a-4133-800a-875dcd252d67@redhat.com>
Date: Fri, 14 Mar 2025 10:17:26 +0100
From: David Hildenbrand <david@...hat.com>
To: Rakie Kim <rakie.kim@...com>
Cc: akpm@...ux-foundation.org, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, linux-cxl@...r.kernel.org,
 joshua.hahnjy@...il.com, dan.j.williams@...el.com,
 ying.huang@...ux.alibaba.com, kernel_team@...ynix.com, honggyu.kim@...com,
 yunjeong.mun@...com, Gregory Price <gourry@...rry.net>
Subject: Re: [PATCH v2 2/4] mm/mempolicy: Support memory hotplug in weighted
 interleave

> Hi David,

Hi :)

> 
> I am currently working on adding memory hotplug-related functionality
> to the weighted interleave feature. While discussing this with Gregory,
> a question came up. If you happen to know the answer to the following,
> I would greatly appreciate your input.
> 
> I have added the following logic to call add_weight_node when a node
> transitions to the N_MEMORY state to create a sysfs entry. Conversely,
> when all memory blocks of a node go offline (!N_MEMORY),
> I call sysfs_wi_node_release to remove the corresponding sysfs entry.
> 

As a spoiler: I don't like how we squeezed the status_change_nid / 
status_change_nid_normal stuff into the memory notifier that works on a 
single memory block -> single zone. But decoupling it is not as easy, 
because we have this status_change_nid vs. status_change_nid_normal thing.

For the basic "node going offline / node going online", a separate 
notifier (or separate callbacks) would make at least your use case much 
clearer.

The whole "status_change_nid_normal" is only used by slab. I wonder if 
we could get rid of it, and simply let slab check the relevant 
zone->present pages when notified about onlining/offlining of a singe 
memory block.

Then, we would only have status_change_nid and could just convert that 
to a NODE_MEM_ONLINE / NODE_MEM_OFFLINE notifier or sth like that.

Hmmm, if I wouldn't be on PTO, I would prototype that real quick :)

> +static int wi_node_notifier(struct notifier_block *nb,
> +                              unsigned long action, void *data)
> +{
> +       int err;
> +       struct memory_notify *arg = data;
> +       int nid = arg->status_change_nid;
> +
> +       if (nid < 0)
> +               goto notifier_end;
> +
> +       switch(action) {
> +       case MEM_ONLINE:
> +               err = add_weight_node(nid, wi_kobj);
> +               if (err) {
> +                       pr_err("failed to add sysfs [node%d]\n", nid);
> +                       kobject_put(wi_kobj);
> +                       return NOTIFY_BAD;
> +               }
> +               break;
> +       case MEM_OFFLINE:
> +               sysfs_wi_node_release(node_attrs[nid], wi_kobj);
> +               break;
> +       }
> +
> +notifier_end:
> +       return NOTIFY_OK;
> +}
> 
> One question I have is whether the MEM_OFFLINE action in the code
> below will be triggered when a node that consists of multiple memory
> blocks has only one of its memory blocks transitioning to the offline state.
> 

node_states_check_changes_offline() should be making sure that that is 
the case.

Only if offlining the current block will make the node (all zones) have 
no present pages any more, then we'll set status_change_nid to >= 0.


> +       int nid = arg->status_change_nid;
> +
> +       if (nid < 0)
> +               goto notifier_end;
> 
> Based on my analysis, wi_node_notifier should function as expected
> because arg->status_change_nid only holds a non-negative value
> under the following conditions:
> 
> 1) !N_MEMORY -> N_MEMORY
>     When the first memory block of a node transitions to the online state,
>     it holds a non-negative value.
>     In all other cases, it remains -1 (NUMA_NO_NODE).
> 
> 2) N_MEMORY -> !N_MEMORY
>     When all memory blocks of a node transition to the offline state,
>     it holds a non-negative value.
>     In all other cases, it remains -1 (NUMA_NO_NODE).
> 
> I would truly appreciate it if you could confirm whether this analysis is correct.
> Below is a more detailed explanation of my findings.
> 

Yes, that's at least how it is supposed to work (-bugs, but I am not 
aware of any) :)

-- 
Cheers,

David / dhildenb


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ