Message-ID: <b0d4db87-1d58-4877-8a64-55a71f1960d1@kernel.org>
Date: Thu, 22 Jan 2026 23:44:02 +0100
From: "David Hildenbrand (Red Hat)" <david@...nel.org>
To: Gregory Price <gourry@...rry.net>
Cc: linux-mm@...ck.org, linux-cxl@...r.kernel.org, nvdimm@...ts.linux.dev,
linux-kernel@...r.kernel.org, virtualization@...ts.linux.dev,
kernel-team@...a.com, dan.j.williams@...el.com, vishal.l.verma@...el.com,
dave.jiang@...el.com, mst@...hat.com, jasowang@...hat.com,
xuanzhuo@...ux.alibaba.com, eperezma@...hat.com, osalvador@...e.de,
akpm@...ux-foundation.org, Hannes Reinecke <hare@...e.de>
Subject: Re: [PATCH 8/8] dax/kmem: add memory notifier to block external state
changes

On 1/14/26 18:07, Gregory Price wrote:
> On Wed, Jan 14, 2026 at 10:44:08AM +0100, David Hildenbrand (Red Hat) wrote:
>> On 1/14/26 09:52, Gregory Price wrote:
>>> Add a memory notifier to prevent external operations from changing the
>>> online/offline state of memory blocks managed by dax_kmem. This ensures
>>> state changes only occur through the driver's hotplug sysfs interface,
>>> providing consistent state tracking and preventing races with auto-online
>>> policies or direct memory block sysfs manipulation.
>>>
>>> The notifier uses a transition protocol with memory barriers:
>>> - Before initiating a state change, set target_state then in_transition
>>> - Use a barrier to ensure target_state is visible before in_transition
>>> - The notifier checks in_transition, then uses barrier before reading
>>>   target_state to ensure proper ordering on weakly-ordered architectures
>>>
>>> The notifier callback:
>>> - Returns NOTIFY_DONE for non-overlapping memory (not our concern)
>>> - Returns NOTIFY_BAD if in_transition is false (block external ops)
>>> - Validates the memory event matches target_state (MEM_GOING_ONLINE
>>>   for online operations, MEM_GOING_OFFLINE for offline/unplug)
>>> - Returns NOTIFY_OK only for driver-initiated operations with matching
>>>   target_state
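
To make the quoted protocol concrete, here is a minimal user-space sketch of
it. It is not the code from the patch. The names struct kmem_range,
begin_transition(), end_transition() and kmem_notify() are hypothetical, the
enums are local stand-ins rather than the kernel's definitions, and C11
release/acquire atomics stand in for the kernel barriers the commit message
mentions. Returning NOTIFY_BAD on a mismatching event is an assumption; the
commit message only says NOTIFY_OK is returned for a matching target_state.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Local stand-ins (assumptions), not the kernel's actual definitions. */
enum notify_result { NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD };
enum mem_event { MEM_GOING_ONLINE, MEM_GOING_OFFLINE };
enum target_state { TARGET_ONLINE, TARGET_OFFLINE };

/* Hypothetical per-device state: the range [start, end) the driver manages. */
struct kmem_range {
	unsigned long start, end;
	_Atomic int target_state;
	atomic_bool in_transition;
};

/* Driver side: publish target_state first, then raise in_transition. */
static void begin_transition(struct kmem_range *r, enum target_state t)
{
	atomic_store_explicit(&r->target_state, t, memory_order_relaxed);
	/* Release store: target_state is visible before in_transition. */
	atomic_store_explicit(&r->in_transition, true, memory_order_release);
}

/* Driver side: clear the flag once the state change has finished. */
static void end_transition(struct kmem_range *r)
{
	atomic_store_explicit(&r->in_transition, false, memory_order_release);
}

/* Notifier side: decide whether an event on [start, end) may proceed. */
static enum notify_result kmem_notify(struct kmem_range *r, enum mem_event ev,
				      unsigned long start, unsigned long end)
{
	int t;

	/* Non-overlapping memory is not our concern. */
	if (end <= r->start || start >= r->end)
		return NOTIFY_DONE;

	/* Acquire load pairs with the release store in begin_transition(). */
	if (!atomic_load_explicit(&r->in_transition, memory_order_acquire))
		return NOTIFY_BAD;	/* external operation: block it */

	t = atomic_load_explicit(&r->target_state, memory_order_relaxed);
	if ((ev == MEM_GOING_ONLINE && t == TARGET_ONLINE) ||
	    (ev == MEM_GOING_OFFLINE && t == TARGET_OFFLINE))
		return NOTIFY_OK;	/* driver-initiated, matching event */

	return NOTIFY_BAD;		/* assumed: mismatch is rejected */
}
```

In this sketch the driver would call begin_transition() before asking the
core to online or offline the range and end_transition() afterwards. Any
event arriving outside that window for an overlapping range is rejected.
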
>>>
>>> This prevents scenarios where:
>>> - Auto-online policies re-online memory the driver is trying to offline
>>
>> Is this still a problem when using offline_and_remove_memory() ?
>>
>
> I suppose this commit more than the others is actually an RFC.
>
> DAX might not want it. Other drivers might. Now at least I have the
> code to do that.
>
>>> - Users manually change memory state via /sys/devices/system/memory/
>>
>> I don't see why we would want to care about that :)
>>
>
> Absolutely critical if we have something like a CXL DCD region that
> wants to try to protect hot-unplug. But that is probably an argument
> for implementing this in a cxl region driver than DAX.

I asked further questions about that in my reply to the other, longer mail.

Don't try to make the whole CXL DCD more special than anything else we
already have; otherwise you'll end up creating a mess for user space.

--
Cheers
David