Message-ID: <aWfUAiag6khaLJpq@gourry-fedora-PF4VCD3F>
Date: Wed, 14 Jan 2026 12:36:02 -0500
From: Gregory Price <gourry@...rry.net>
To: "David Hildenbrand (Red Hat)" <david@...nel.org>
Cc: linux-mm@...ck.org, linux-cxl@...r.kernel.org, nvdimm@...ts.linux.dev,
linux-kernel@...r.kernel.org, virtualization@...ts.linux.dev,
kernel-team@...a.com, dan.j.williams@...el.com,
vishal.l.verma@...el.com, dave.jiang@...el.com, mst@...hat.com,
jasowang@...hat.com, xuanzhuo@...ux.alibaba.com,
eperezma@...hat.com, osalvador@...e.de, akpm@...ux-foundation.org,
Hannes Reinecke <hare@...e.de>
Subject: Re: [PATCH 8/8] dax/kmem: add memory notifier to block external
state changes
On Wed, Jan 14, 2026 at 10:44:08AM +0100, David Hildenbrand (Red Hat) wrote:
> On 1/14/26 09:52, Gregory Price wrote:
> > Add a memory notifier to prevent external operations from changing the
> > online/offline state of memory blocks managed by dax_kmem. This ensures
> > state changes only occur through the driver's hotplug sysfs interface,
> > providing consistent state tracking and preventing races with auto-online
> > policies or direct memory block sysfs manipulation.
> >
> > The notifier uses a transition protocol with memory barriers:
> > - Before initiating a state change, set target_state then in_transition
> > - Use a barrier to ensure target_state is visible before in_transition
> > - The notifier checks in_transition, then uses barrier before reading
> > target_state to ensure proper ordering on weakly-ordered architectures
> >
> > The notifier callback:
> > - Returns NOTIFY_DONE for non-overlapping memory (not our concern)
> > - Returns NOTIFY_BAD if in_transition is false (block external ops)
> > - Validates the memory event matches target_state (MEM_GOING_ONLINE
> > for online operations, MEM_GOING_OFFLINE for offline/unplug)
> > - Returns NOTIFY_OK only for driver-initiated operations with matching
> > target_state
> >
> > This prevents scenarios where:
> > - Auto-online policies re-online memory the driver is trying to offline
>
> Is this still a problem when using offline_and_remove_memory() ?
>
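For reference, the transition protocol quoted above can be modeled in
userspace roughly as follows. This is only a sketch, not the patch code:
C11 release/acquire atomics stand in for the kernel barriers, and every
name here (struct region, begin_transition, notifier_check, the VERDICT_*
and EV_* values) is made up for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for NOTIFY_DONE / NOTIFY_OK / NOTIFY_BAD. */
enum verdict { VERDICT_DONE, VERDICT_OK, VERDICT_BAD };
/* Hypothetical stand-ins for MEM_GOING_ONLINE / MEM_GOING_OFFLINE. */
enum mem_event { EV_GOING_ONLINE, EV_GOING_OFFLINE };
enum target { TARGET_ONLINE, TARGET_OFFLINE };

struct region {
	unsigned long start, end;	/* managed range, [start, end) */
	_Atomic int target_state;	/* what the driver is trying to do */
	atomic_bool in_transition;	/* true only during a driver-initiated op */
};

static void region_init(struct region *r, unsigned long start,
			unsigned long end)
{
	r->start = start;
	r->end = end;
	atomic_init(&r->target_state, TARGET_ONLINE);
	atomic_init(&r->in_transition, false);
}

/* Driver side: publish target_state first, then in_transition. */
static void begin_transition(struct region *r, enum target t)
{
	/* Sanity check: this sketch allows one transition at a time. */
	assert(!atomic_load_explicit(&r->in_transition, memory_order_relaxed));
	atomic_store_explicit(&r->target_state, t, memory_order_relaxed);
	/* Release store: target_state is visible before in_transition. */
	atomic_store_explicit(&r->in_transition, true, memory_order_release);
}

static void end_transition(struct region *r)
{
	atomic_store_explicit(&r->in_transition, false, memory_order_release);
}

/* Notifier side: check in_transition, then read target_state. */
static enum verdict notifier_check(struct region *r, enum mem_event ev,
				   unsigned long start, unsigned long end)
{
	int t;

	/* Non-overlapping memory is not our concern. */
	if (end <= r->start || start >= r->end)
		return VERDICT_DONE;

	/* Acquire load pairs with the release store in begin_transition(). */
	if (!atomic_load_explicit(&r->in_transition, memory_order_acquire))
		return VERDICT_BAD;	/* external op: block it */

	t = atomic_load_explicit(&r->target_state, memory_order_relaxed);
	if ((ev == EV_GOING_ONLINE && t == TARGET_ONLINE) ||
	    (ev == EV_GOING_OFFLINE && t == TARGET_OFFLINE))
		return VERDICT_OK;	/* driver-initiated and matching */

	return VERDICT_BAD;		/* event does not match target_state */
}
```

In this model an offline request that arrives outside begin_transition()
and end_transition() is rejected, which is the behavior that shuts out
direct memory block sysfs writes.
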
I just remembered another reason I did this:

    echo offline > memoryN/state

This leaves the dax hotplug state inconsistent: if you do the above for
every block in a dax region, `daxN.M/hotplug` still shows up as online.

The notifier just hard-locks the state so it stays consistent (unless an
online/offline fails and its rollback fails too).

The additional complexity seemed warranted for that, but if you're happy
to leave users to their footguns I'm not going to argue it.
---
I just realized this breaks the current ndctl pattern: onlining memory
blocks directly will fail, so ndctl would be forced to convert to `hotplug`.
~Gregory