Message-ID: <aPbOfFPIhtu5npaG@aschofie-mobl2.lan>
Date: Mon, 20 Oct 2025 17:06:20 -0700
From: Alison Schofield <alison.schofield@...el.com>
To: "Koralahalli Channabasappa, Smita" <skoralah@....com>
CC: Smita Koralahalli <Smita.KoralahalliChannabasappa@....com>,
<linux-cxl@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<nvdimm@...ts.linux.dev>, <linux-fsdevel@...r.kernel.org>,
<linux-pm@...r.kernel.org>, Davidlohr Bueso <dave@...olabs.net>, "Jonathan
Cameron" <jonathan.cameron@...wei.com>, Dave Jiang <dave.jiang@...el.com>,
Vishal Verma <vishal.l.verma@...el.com>, Ira Weiny <ira.weiny@...el.com>,
"Dan Williams" <dan.j.williams@...el.com>, Matthew Wilcox
<willy@...radead.org>, Jan Kara <jack@...e.cz>, "Rafael J . Wysocki"
<rafael@...nel.org>, Len Brown <len.brown@...el.com>, Pavel Machek
<pavel@...nel.org>, Li Ming <ming.li@...omail.com>, Jeff Johnson
<jeff.johnson@....qualcomm.com>, "Ying Huang" <huang.ying.caritas@...il.com>,
Yao Xingtao <yaoxt.fnst@...itsu.com>, Peter Zijlstra <peterz@...radead.org>,
Greg KH <gregkh@...uxfoundation.org>, Nathan Fontenot
<nathan.fontenot@....com>, Terry Bowman <terry.bowman@....com>, Robert
Richter <rrichter@....com>, Benjamin Cheatham <benjamin.cheatham@....com>,
Zhijian Li <lizhijian@...itsu.com>, "Borislav Petkov" <bp@...en8.de>, Ard
Biesheuvel <ardb@...nel.org>
Subject: Re: [PATCH v3 0/5] dax/hmem, cxl: Coordinate Soft Reserved handling
with CXL
On Tue, Oct 14, 2025 at 10:52:20AM -0700, Koralahalli Channabasappa, Smita wrote:
> Hi Alison,
>
> On 10/10/2025 1:49 PM, Alison Schofield wrote:
> > On Mon, Oct 06, 2025 at 06:16:24PM -0700, Alison Schofield wrote:
> > > On Tue, Sep 30, 2025 at 04:47:52AM +0000, Smita Koralahalli wrote:
> > > > This series aims to address long-standing conflicts between dax_hmem and
> > > > CXL when handling Soft Reserved memory ranges.
> > >
> > > Hi Smita,
> > >
> > > Thanks for the updates Smita!
> > >
> > > About those "long-standing conflicts": In the next rev, can you resurrect,
> > > or recreate the issues list that this set is addressing. It's been a
> > > long and winding road with several handoffs (me included) and it'll help
> > > keep the focus.
> > >
> > > Hotplug works :) Auto region comes up, we tear it down and can recreate it,
> > > in place, because the soft reserved resource is gone (no longer occupying
> > > the CXL Window and causing recreate to fail.)
> > >
> > > !CONFIG_CXL_REGION works :) All resources go directly to DAX.
> > >
> > > The scenario that is failing is handoff to DAX after region assembly
> > > failure. (Dan reminded me to check that today.) That is mostly related
> > > to Patch4, so I'll respond there.
> > >
> > > --Alison
> >
> > Hi Smita -
> >
> > (after off-list chat w Smita about what is and is not included)
> >
> > This CXL failover to DAX case is not implemented. In my response in Patch 4,
> > I cobbled something together that made it work in one test case. But to be
> > clear, there was some trickery in the CXL region driver to even do that.
> >
> > One path forward is to update this set restating the issues it addresses, and
> > remove any code and comments that are tied to failing over to DAX after a
> > region assembly failure.
> >
> > That leaves the issue Dan raised, "shutdown CXL in favor of vanilla DAX devices
> > as an emergency fallback for platform configuration quirks and bugs"[1], for a
> > future patch.
> >
> > -- Alison
> >
> > [1] The failover to DAX was last described in response to v5 of the 'prior' patchset.
> > https://lore.kernel.org/linux-cxl/20250715180407.47426-1-Smita.KoralahalliChannabasappa@amd.com/
> > https://lore.kernel.org/linux-cxl/687ffcc0ee1c8_137e6b100ed@dwillia2-xfh.jf.intel.com.notmuch/
> > https://lore.kernel.org/linux-cxl/68808fb4e4cbf_137e6b100cc@dwillia2-xfh.jf.intel.com.notmuch/
>
> [+cc Nathan, Terry]
>
> From the AMD side, our primary concern in this series is CXL hotplug. With
> the patches as is, the hotplug flows are working for us: region comes up, we
> can tear it down, and recreate it in place because the soft reserved window
> is released.
>
> On our systems I consistently see wait_for_device_probe() block until region
> assembly has completed so I don’t currently have evidence of a sequencing
> hole there on AMD platforms.
>
> Once CXL windows are discovered, would it be acceptable for dax_hmem to
> simply ignore soft reserved ranges inside those windows, assuming CXL will
> own and manage them? That aligns with Dan’s guidance about letting CXL win
> those ranges when present.
> https://lore.kernel.org/all/687fef9ec0dd9_137e6b100c8@dwillia2-xfh.jf.intel.com.notmuch/
>
> If that approach sounds right, I can reword the commit descriptions in
> patches 4/5 and 5/5 to drop the parts about region assembly failures and
> remove the REGISTER enum.
>
> And then leave the “shutdown CXL in favor of vanilla DAX as an emergency
> fallback for platform configuration quirks and bugs” to a future, dedicated
> patch.
>
> Thanks
> Smita

Hi Smita,

I was able to discard the big sleep after picking up the patch "cxl/mem:
Arrange for always-synchronous memdev attach" from Alejandro's Type2 set.
With that patch, all CXL probing completed before the HMEM probe, so the
deferred waiting mechanism of the HMEM driver seems unnecessary. Please
take a look.

That patch is one of four in this branch Dan provided:
https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.18/cxl-probe-order

After chats with Dan and DaveJ, we thought the Soft Reserved set was the
right place to introduce these probe order patches (let Type 2 follow).
So, the SR set adds these three patches:
- **cxl/mem: Arrange for always-synchronous memdev attach**
- cxl/port: Arrange for always synchronous endpoint attach
- cxl/mem: Introduce a memdev creation ->probe() operation

**I actually grabbed this one from v19 Type2 set, not the CXL branch,
so you may need to see if Alejandro changed anything in that one.

When picking those up, there's a bit of wordsmithing to do in the
commit logs. Probably replace mentions of this being needed for
accelerators with it being needed to synchronize the usage of
soft-reserved resources.

Note that the HMEM driver is also not picking up unused SR ranges.
That was described in review comments here:
https://lore.kernel.org/linux-cxl/aORscMprmQyGlohw@aschofie-mobl2.lan

Summarized for my benefit ;)
- pick up all the probe order patches,
- determine whether the HMEM deferral is needed, maybe drop it,
- register the unused SR, don't drop based on intersect w 'CXL Window'

With all that, nothing would be left undone in the HMEM driver. The region
driver would still need to fail gracefully and release resources in a
follow-on patch.
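
To illustrate the third item in the summary, here is a rough userspace
sketch, not kernel code, of carving the unused part out of a Soft Reserved
range rather than dropping the whole range because it intersects a CXL
Window. Everything in it is an assumption made for illustration: struct
range, the helper name unused_soft_reserved(), and the reading of "unused"
as "not covered by an assembled CXL region" are hypothetical and do not
come from the patch set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range; a hypothetical stand-in for a kernel resource. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Write the parts of 'sr' that no entry of 'used' covers into 'out'.
 * 'used' must be sorted by start and non-overlapping. Returns the number
 * of ranges written, which is at most n_used + 1, so 'out' needs room
 * for that many entries.
 */
static size_t unused_soft_reserved(struct range sr,
				   const struct range *used, size_t n_used,
				   struct range *out)
{
	size_t n_out = 0;
	uint64_t cursor = sr.start;	/* first address not yet accounted for */
	int done = 0;

	assert(sr.start <= sr.end);

	for (size_t i = 0; i < n_used && !done; i++) {
		if (used[i].end < cursor)
			continue;	/* entirely below what is left of sr */
		if (used[i].start > sr.end)
			break;		/* entirely above sr */
		if (used[i].start > cursor)
			out[n_out++] = (struct range){ cursor, used[i].start - 1 };
		if (used[i].end >= sr.end)
			done = 1;	/* covered through the end of sr */
		else
			cursor = used[i].end + 1;
	}
	if (!done)
		out[n_out++] = (struct range){ cursor, sr.end };
	return n_out;
}
```

With a Soft Reserved range of 0x1000-0x4fff and one used range of
0x2000-0x2fff, the helper yields two leftover ranges, 0x1000-0x1fff and
0x3000-0x4fff, which in this sketch are the pieces that would be handed to
HMEM. A used range that covers the whole Soft Reserved range yields nothing.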

Let me know what you find wrt the timing, i.e. is the wait_for_device_probe()
needed at all?

Thanks!
-- Alison