Message-ID: <b1b4ce23-d02c-46ec-96fe-ada6ae0948c6@suse.de>
Date: Mon, 12 Jan 2026 08:28:29 +0100
From: Hannes Reinecke <hare@...e.de>
To: Gregory Price <gourry@...rry.net>,
 "David Hildenbrand (Red Hat)" <david@...nel.org>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org, kernel-team@...a.com,
 osalvador@...e.de, gregkh@...uxfoundation.org, rafael@...nel.org,
 dakr@...nel.org, akpm@...ux-foundation.org, lorenzo.stoakes@...cle.com,
 Liam.Howlett@...cle.com, vbabka@...e.cz, rppt@...nel.org, surenb@...gle.com,
 mhocko@...e.com
Subject: Re: [RFC PATCH] memory,memory_hotplug: allow restricting memory
 blocks to zone movable

On 1/9/26 17:41, Gregory Price wrote:
> On Thu, Jan 08, 2026 at 03:16:24PM +0100, David Hildenbrand (Red Hat) wrote:
>> On 1/8/26 08:31, Hannes Reinecke wrote:
>>> On 1/6/26 21:22, David Hildenbrand (Red Hat) wrote:
>>>> On 1/6/26 20:59, Gregory Price wrote:
>>
>>> For hardware-based scenarios memory will always be removed in
>>> larger entities (eg the CXL device), and it's always an 'all-or-nothing'
>>> scenario; you cannot remove individual memory blocks on a CXL device.
>>> So there the memory block abstraction makes less sense, and it
>>> would be good to have a single 'knob' to remove the entire CXL
>>> device and all memory blocks on it.
>>> Sure, it might take some time, but one doesn't need to worry about
>>> restoring the original state if the operation on one block fails.
>>
>> That's not what I was getting at:
>>
>> offline_and_remove_memory() can be called on large regions, and it properly
>> handles whether we have to back out because some offlining failed.
>>
>> The issue arises once dax would have to call offline_and_remove_memory()
>> multiple times, on non-contiguous areas. Of course, we could handle that by
>> providing an interface that consumes multiple memory ranges.
>>
>> For the DAX use case, I think we'd really want a way to just use
>>
>> * add_and_online_memory() [does not exist yet, but ppc does something
>>    similar]
>> * offline_and_remove_memory()
>>
> 
> I'm starting to think this issue is actually the result of bad patterns
> in the cxl driver - namely using dax as a path to hotplug sysram.
> 
> I suppose either we need a `cxl/dax_region/remove` that handles the
> whole operation in one go, or
> 
> we want `cxl/region/commit` to handle hot(un)plug as a single action.
> 
> tl;dr:  Split the dax use case from the sysram use case, and make a
>          cxl sysram driver directly manage hotplug rather than use dax.
> 

Well ... not sure.
We are doing fine even currently during boot up; we can align policies
and everything to ensure the system comes up with the 'correct' setting.
Things start to get iffy if one is reconfiguring memory to move from
daxdev to system RAM and vice versa.
Currently we can do this with a simple memory online/offline; with your
suggestion we would need to remove the memory, too, when doing that.
That might get even more awkward, as it most likely involves calling
the hotplug functions for the CXL device itself ...
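
For reference, the 'simple memory online/offline' flow mentioned above can
be sketched roughly as follows. This is only a minimal Python sketch, not
anything taken from the cxl or dax drivers: the helper names are
hypothetical, the sysfs root is passed in as a parameter so the logic can
be tried against a scratch directory, and it assumes the usual
memoryN/state files under /sys/devices/system/memory. The rollback writes
plain "online", so it would not restore a zone-specific choice such as
online_movable.

```python
import os


def set_block_state(block_dir, state):
    """Write a new state string into a memory block's 'state' file."""
    with open(os.path.join(block_dir, "state"), "w") as f:
        f.write(state)


def offline_blocks(sysfs_root, block_ids):
    """Offline every listed block, or put them all back online on failure.

    Returns True if all blocks went offline, False if one write failed
    and the blocks already offlined were switched back to "online".
    """
    done = []
    for block_id in block_ids:
        block_dir = os.path.join(sysfs_root, f"memory{block_id}")
        try:
            set_block_state(block_dir, "offline")
        except OSError:
            # One block refused (or does not exist): undo the earlier ones
            # in reverse order so we do not leave a half-offlined device.
            for earlier in reversed(done):
                set_block_state(earlier, "online")
            return False
        done.append(block_dir)
    return True
```

On a real system this would be called as
offline_blocks("/sys/devices/system/memory", ids) with root privileges;
a per-device 'remove' knob would move this loop and its rollback into
the kernel instead.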

So not sure if it's a win. But one should try and see where we end up.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke                  Kernel Storage Architect
hare@...e.de                                +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich
