linux-kernel - Re: [PATCH 0/2] Implement numa node notifier

Message-ID: <b9d5a23c-f97c-4d11-b468-5a83ee2e25e2@redhat.com>
Date: Thu, 3 Apr 2025 15:08:18 +0200
From: David Hildenbrand <david@...hat.com>
To: Oscar Salvador <osalvador@...e.de>, Vlastimil Babka <vbabka@...e.cz>
Cc: Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org, Hyeonggon Yoo <42.hyeyoo@...il.com>,
 mkoutny@...e.com, Dan Williams <dan.j.williams@...el.com>,
 Jonathan Cameron <Jonathan.Cameron@...wei.com>
Subject: Re: [PATCH 0/2] Implement numa node notifier

On 03.04.25 15:02, David Hildenbrand wrote:
> On 02.04.25 19:03, Oscar Salvador wrote:
>> On Wed, Apr 02, 2025 at 06:06:51PM +0200, Vlastimil Babka wrote:
>>> What if we had two chains:
>>>
>>> register_node_notifier()
>>> register_node_normal_notifier()
>>>
>>> I think they could have shared the state #defines and struct node_notify
>>> would have just one nid and be always >= 0.
>>>
>>> Or would it add too much extra boilerplate and only slab cares?
>>
>> We could indeed go in that direction to try to decouple
>> status_change_nid from status_change_nid_normal.
>>
>> Although as you said, slub is the only user of status_change_nid_normal
>> for the time being, so I am not sure about adding a second chain for only
>> one user.
>>
>> Might look cleaner though, and the advantage is that slub would not get
>> notified for nodes acquiring only ZONE_MOVABLE.
>>
>> Let us see what David thinks about it.
> 
> I'd hope we'd be able to get rid of the _normal stuff completely, it seems
> way too specialized.
> 
> We added that in
> 
> commit b9d5ab2562eceeada5e4837a621b6260574dd11d
> Author: Lai Jiangshan <laijs@...fujitsu.com>
> Date:   Tue Dec 11 16:01:05 2012 -0800
> 
>       slub, hotplug: ignore unrelated node's hot-adding and hot-removing
>       
>       SLUB only focuses on the nodes which have normal memory and it ignores the
>       other node's hot-adding and hot-removing.
>       
>       Aka: if some memory of a node which has no onlined memory is online, but
>       this new memory onlined is not normal memory (for example, highmem), we
>       should not allocate kmem_cache_node for SLUB.
>       
>       And if the last normal memory is offlined, but the node still has memory,
>       we should remove kmem_cache_node for that node.  (The current code delays
>       it when all of the memory is offlined)
>       
>       So we only do something when marg->status_change_nid_normal > 0.
>       marg->status_change_nid is not suitable here.
>       
>       The same problem doesn't exist in SLAB, because SLAB allocates kmem_list3
>       for every node even the node don't have normal memory, SLAB tolerates
>       kmem_list3 on alien nodes.  SLUB only focuses on the nodes which have
>       normal memory, it don't tolerate alien kmem_cache_node.  The patch makes
>       SLUB become self-compatible and avoids WARNs and BUGs in rare conditions.
> 
> 
> How "bad" would it be if we do the slab_mem_going_online_callback() etc even
> for completely-movable nodes? I assume one kmem_cache_alloc() per slab_caches.
> 
> slab_mem_going_offline_callback() only does shrinking, #dontcare
> 
> Looking at slab_mem_offline_callback(), we never even free the caches either
> way when offlining. So the implication would be that we would have movable-only nodes
> set in slab_nodes.
> 
> 
> We don't expect many such nodes, so ... do we care?
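
For comparison, the two-chain alternative quoted above can be sketched as a small userspace model. This is only an illustration under stated assumptions: the chain layout, the callback type, NODE_BECAME_MEM_AWARE and online_first_memory() are hypothetical names and not kernel API; only the idea of a struct node_notify with a single nid that is always >= 0 comes from the thread.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace model only; every name here is hypothetical, not kernel API. */
#define MAX_CB 4
#define NODE_BECAME_MEM_AWARE 1UL   /* stand-in for a shared state #define */

struct node_notify {
    int nid;                        /* one nid, always >= 0 in this model */
};

typedef void (*node_cb)(unsigned long action, const struct node_notify *nn);

struct chain {
    node_cb cbs[MAX_CB];
    int n;
};

static struct chain node_chain;         /* fires for any memory on a node */
static struct chain node_normal_chain;  /* fires for normal memory only */

static int chain_register(struct chain *c, node_cb cb)
{
    if (c->n == MAX_CB)
        return -1;
    c->cbs[c->n++] = cb;
    return 0;
}

static void chain_call(struct chain *c, unsigned long action, int nid)
{
    struct node_notify nn = { .nid = nid };

    assert(nid >= 0);               /* no "-1 means no change" sentinel */
    for (int i = 0; i < c->n; i++)
        c->cbs[i](action, &nn);
}

static int generic_calls, slab_calls;

static void generic_cb(unsigned long action, const struct node_notify *nn)
{
    (void)action; (void)nn;
    generic_calls++;
}

static void slab_cb(unsigned long action, const struct node_notify *nn)
{
    (void)action; (void)nn;
    slab_calls++;
}

/* Model of a node getting its first memory; has_normal is 0 for a
 * node that only gains ZONE_MOVABLE memory. */
static void online_first_memory(int nid, int has_normal)
{
    chain_call(&node_chain, NODE_BECAME_MEM_AWARE, nid);
    if (has_normal)
        chain_call(&node_normal_chain, NODE_BECAME_MEM_AWARE, nid);
}

#ifdef DEMO_MAIN
int main(void)
{
    chain_register(&node_chain, generic_cb);
    chain_register(&node_normal_chain, slab_cb);
    online_first_memory(1, 0);      /* movable-only node */
    online_first_memory(2, 1);      /* node with normal memory */
    printf("generic=%d slab=%d\n", generic_calls, slab_calls);
    return 0;
}
#endif
```

Built with -DDEMO_MAIN, the model calls the generic subscriber for both nodes and the slab-like subscriber only for the node with normal memory. Dropping the second chain, as suggested above, would mean the slab-like subscriber is called for both.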

BTW, isn't the description of slab_nodes wrong?

"Tracks for which NUMA nodes we have kmem_cache_nodes allocated." -- but 
as there is no freeing done in slab_mem_offline_callback(), isn't it 
always kept allocated?

(probably I am missing something)
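
To make the question concrete, here is a rough userspace model of the behaviour as I read it. It is a sketch and not the mm/slub.c code: slab_nodes_model, NR_CACHES and the model_*() helpers are made-up names, and the "skip if already tracked" check is my assumption about how a repeated online would behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only; names and sizes are hypothetical. */
#define MAX_NODES 8
#define NR_CACHES 3     /* stand-in for the number of entries in slab_caches */

static unsigned long slab_nodes_model;  /* bit n set: node n is tracked */
static int per_node_allocs;             /* per-node structures allocated */

static bool node_tracked(int nid)
{
    return (slab_nodes_model & (1UL << nid)) != 0;
}

/* Going online: one per-node allocation per cache, unless the node
 * already has its structures from an earlier online (assumption). */
static void model_going_online(int nid)
{
    assert(nid >= 0 && nid < MAX_NODES);
    if (node_tracked(nid))
        return;
    per_node_allocs += NR_CACHES;
    slab_nodes_model |= 1UL << nid;
}

/* Offline: deliberately frees nothing and clears no bit, which is the
 * behaviour the question above is about. */
static void model_offline(int nid)
{
    assert(nid >= 0 && nid < MAX_NODES);
}

#ifdef DEMO_MAIN
int main(void)
{
    model_going_online(1);
    model_offline(1);
    model_going_online(1);
    printf("tracked=%d allocs=%d\n", node_tracked(1), per_node_allocs);
    return 0;
}
#endif
```

In this model a node stays tracked after its first online, so the mask ends up meaning "nodes that ever had the structures allocated", and a movable-only node would cost NR_CACHES allocations exactly once.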

-- 
Cheers,

David / dhildenb