netdev - Re: [PATCH v19 01/22] cxl/mem: Arrange for always-synchronous memdev attach

Message-ID: <f94420fc-0b06-4d55-8178-b5eb07bc4bcd@amd.com>
Date: Mon, 10 Nov 2025 10:43:14 +0000
From: Alejandro Lucero Palau <alucerop@....com>
To: "Koralahalli Channabasappa, Smita" <skoralah@....com>,
 Jonathan Cameron <jonathan.cameron@...wei.com>,
 alejandro.lucero-palau@....com
Cc: linux-cxl@...r.kernel.org, netdev@...r.kernel.org,
 dan.j.williams@...el.com, edward.cree@....com, davem@...emloft.net,
 kuba@...nel.org, pabeni@...hat.com, edumazet@...gle.com,
 dave.jiang@...el.com, Alison Schofield <alison.schofield@...el.com>
Subject: Re: [PATCH v19 01/22] cxl/mem: Arrange for always-synchronous memdev
 attach

Hi Smita,


On 10/30/25 19:57, Koralahalli Channabasappa, Smita wrote:
> Hi Alejandro,
>
> I need patches 1–3 from this series as prerequisites for the 
> Soft-Reserved coordination work, so I wanted to check in on your plans 
> for the next revision.
>
> Link to the discussion: 
> https://lore.kernel.org/all/aPbOfFPIhtu5npaG@aschofie-mobl2.lan/
>
> Are patches 1–3 already being updated as part of your v20 work?
> If so, I can wait and pick them up from v20 directly.


Yes, I'm sending v20 later today. This patch includes some changes that fix
the reported problems.


>
> If v20 is still in progress and may take some time, I can probably 
> carry patches 1–3 at the start of my series, and if that helps, I can 
> fold in the review comments by Jonathan while keeping authorship as 
> is. I would only adjust wording in the commit descriptions to reflect 
> the Soft-Reserved coordination context.
>
> Alternatively, if you prefer to continue carrying them in the Type-2 
> series, I can simply reference them as prerequisites instead.
>
> I’m fine with either approach; I'm just trying to avoid duplicated effort
> and keep review in one place.


Let's see how v20 is received regarding the potential merge. If a merge is
not imminent, you could take those patches.


Thanks


>
> Thanks
> Smita
>
> On 10/29/2025 4:20 AM, Alejandro Lucero Palau wrote:
>>
>> On 10/7/25 13:40, Jonathan Cameron wrote:
>>> On Mon, 6 Oct 2025 11:01:09 +0100
>>> <alejandro.lucero-palau@....com> wrote:
>>>
>>>> From: Alejandro Lucero <alucerop@....com>
>>>>
>>>> In preparation for CXL accelerator drivers that have a hard dependency on
>>>> CXL capability initialization, arrange for the endpoint probe result to be
>>>> conveyed to the caller of devm_cxl_add_memdev().
>>>>
>>>> As it stands cxl_pci does not care about the attach state of the cxl_memdev
>>>> because all generic memory expansion functionality can be handled by the
>>>> cxl_core. For accelerators, the driver needs to know whether to perform
>>>> driver-specific initialization if CXL is available, or execute a fallback
>>>> to PCIe-only operation.
>>>>
>>>> By moving devm_cxl_add_memdev() to cxl_mem.ko it removes async module
>>>> loading as one reason that a memdev may not be attached upon return from
>>>> devm_cxl_add_memdev().
>>>>
>>>> The diff is busy as this moves cxl_memdev_alloc() down below the definition
>>>> of cxl_memdev_fops and introduces devm_cxl_memdev_add_or_reset() to
>>>> preclude needing to export more symbols from the cxl_core.
>>>>
>>>> Signed-off-by: Dan Williams <dan.j.williams@...el.com>
>>> Alejandro, SoB chain broken here which makes this currently 
>>> unmergeable.
>>>
>>> Should definitely have your SoB as you sent the patch to the list and need
>>> to make a statement that you believe it to be fine to do so (see the
>>> Developer's Certificate of Origin in the docs).  Also, From should always
>>> be one of the authors.
>>> If Dan wrote this as the SoB suggests then From should be set to him.
>>>
>>> git commit --amend --author="Dan Williams <dan.j.williams@...el.com>"
>>>
>>> Will fix that up.  Then either you add your SoB on the basis that you just
>>> 'handled' the patch but didn't make substantial changes, or your SoB and a
>>> Co-developed-by if you did make major changes.  If it is minor stuff you
>>> can add a sign off with a "# what changed" comment next to it.
>>
>>
>> Understood. I'll ask Dan what he prefers.
>>
>>
>>>
>>> A few minor comments inline.
>>>
>>> Thanks,
>>>
>>> Jonathan
>>>
>>>
>>>> ---
>>>>   drivers/cxl/Kconfig       |  2 +-
>>>>   drivers/cxl/core/memdev.c | 97 ++++++++++++++++-----------------------
>>>>   drivers/cxl/mem.c         | 30 ++++++++++++
>>>>   drivers/cxl/private.h     | 11 +++++
>>>>   4 files changed, 82 insertions(+), 58 deletions(-)
>>>>   create mode 100644 drivers/cxl/private.h
>>>>
>>>> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
>>>> index 028201e24523..111e05615f09 100644
>>>> --- a/drivers/cxl/Kconfig
>>>> +++ b/drivers/cxl/Kconfig
>>>> @@ -22,6 +22,7 @@ if CXL_BUS
>>>>   config CXL_PCI
>>>>       tristate "PCI manageability"
>>>>       default CXL_BUS
>>>> +    select CXL_MEM
>>>>       help
>>>>         The CXL specification defines a "CXL memory device" sub-class in the
>>>>         PCI "memory controller" base class of devices. Device's identified by
>>>> @@ -89,7 +90,6 @@ config CXL_PMEM
>>>>   config CXL_MEM
>>>>       tristate "CXL: Memory Expansion"
>>>> -    depends on CXL_PCI
>>>>       default CXL_BUS
>>>>       help
>>>>         The CXL.mem protocol allows a device to act as a provider of "System
>>>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>>>> index c569e00a511f..2bef231008df 100644
>>>> --- a/drivers/cxl/core/memdev.c
>>>> +++ b/drivers/cxl/core/memdev.c
>>>> -
>>>> -err:
>>>> -    kfree(cxlmd);
>>>> -    return ERR_PTR(rc);
>>>>   }
>>>> +EXPORT_SYMBOL_NS_GPL(devm_cxl_memdev_add_or_reset, "CXL");
>>>>   static long __cxl_memdev_ioctl(struct cxl_memdev *cxlmd, unsigned int cmd,
>>>>                      unsigned long arg)
>>>> @@ -1023,50 +1012,44 @@ static const struct file_operations cxl_memdev_fops = {
>>>>       .llseek = noop_llseek,
>>>>   };
>>>> -struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>>>> -                       struct cxl_dev_state *cxlds)
>>>> +struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds)
>>>>   {
>>>>       struct cxl_memdev *cxlmd;
>>>>       struct device *dev;
>>>>       struct cdev *cdev;
>>>>       int rc;
>>>> -    cxlmd = cxl_memdev_alloc(cxlds, &cxl_memdev_fops);
>>>> -    if (IS_ERR(cxlmd))
>>>> -        return cxlmd;
>>>> +    cxlmd = kzalloc(sizeof(*cxlmd), GFP_KERNEL);
>>> It's a little bit non obvious due to the device initialize mid way
>>> through this, but given there are no error paths after that you can
>>> currently just do.
>>>     struct cxl_memdev *cxlmd __free(kfree) =
>>>         cxl_memdev_alloc(cxlds, &cxl_memdev_fops);
>>> and
>>>     return_ptr(cxlmd);
>>>
>>> in the good path.  That lets you then just return rather than having
>>> the goto err: handling for the error case that currently frees this
>>> manually.
>>>
>>> Unlike the change below, this one I think is definitely worth making.
>>
>>
>> I agree so I'll do it. The below suggestion is also needed ...
>>
>>
>>>
>>>> +    if (!cxlmd)
>>>> +        return ERR_PTR(-ENOMEM);
>>>> -    dev = &cxlmd->dev;
>>>> -    rc = dev_set_name(dev, "mem%d", cxlmd->id);
>>>> -    if (rc)
>>>> +    rc = ida_alloc_max(&cxl_memdev_ida, CXL_MEM_MAX_DEVS - 1, GFP_KERNEL);
>>>> +    if (rc < 0)
>>>>           goto err;
>>>> -
>>>> -    /*
>>>> -     * Activate ioctl operations, no cxl_memdev_rwsem manipulation
>>>> -     * needed as this is ordered with cdev_add() publishing the device.
>>>> -     */
>>>> +    cxlmd->id = rc;
>>>> +    cxlmd->depth = -1;
>>>>       cxlmd->cxlds = cxlds;
>>>>       cxlds->cxlmd = cxlmd;
>>>> -    cdev = &cxlmd->cdev;
>>>> -    rc = cdev_device_add(cdev, dev);
>>>> -    if (rc)
>>>> -        goto err;
>>>> +    dev = &cxlmd->dev;
>>>> +    device_initialize(dev);
>>>> +    lockdep_set_class(&dev->mutex, &cxl_memdev_key);
>>>> +    dev->parent = cxlds->dev;
>>>> +    dev->bus = &cxl_bus_type;
>>>> +    dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
>>>> +    dev->type = &cxl_memdev_type;
>>>> +    device_set_pm_not_required(dev);
>>>> +    INIT_WORK(&cxlmd->detach_work, detach_memdev);
>>>> -    rc = devm_add_action_or_reset(host, cxl_memdev_unregister, cxlmd);
>>>> -    if (rc)
>>>> -        return ERR_PTR(rc);
>>>> +    cdev = &cxlmd->cdev;
>>>> +    cdev_init(cdev, &cxl_memdev_fops);
>>>>       return cxlmd;
>>>>   err:
>>>> -    /*
>>>> -     * The cdev was briefly live, shutdown any ioctl operations that
>>>> -     * saw that state.
>>>> -     */
>>>> -    cxl_memdev_shutdown(dev);
>>>> -    put_device(dev);
>>>> +    kfree(cxlmd);
>>>>       return ERR_PTR(rc);
>>>>   }
>>>> -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, "CXL");
>>>> +EXPORT_SYMBOL_NS_GPL(cxl_memdev_alloc, "CXL");
>>>>   static void sanitize_teardown_notifier(void *data)
>>>>   {
>>>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>>>> index f7dc0ba8905d..144749b9c818 100644
>>>> --- a/drivers/cxl/mem.c
>>>> +++ b/drivers/cxl/mem.c
>>>> @@ -7,6 +7,7 @@
>>>>   #include "cxlmem.h"
>>>>   #include "cxlpci.h"
>>>> +#include "private.h"
>>>>   #include "core/core.h"
>>>>   /**
>>>> @@ -203,6 +204,34 @@ static int cxl_mem_probe(struct device *dev)
>>>>       return devm_add_action_or_reset(dev, enable_suspend, NULL);
>>>>   }
>>>> +/**
>>>> + * devm_cxl_add_memdev - Add a CXL memory device
>>>> + * @host: devres alloc/release context and parent for the memdev
>>>> + * @cxlds: CXL device state to associate with the memdev
>>>> + *
>>>> + * Upon return the device will have had a chance to attach to the
>>>> + * cxl_mem driver, but may fail if the CXL topology is not ready
>>>> + * (hardware CXL link down, or software platform CXL root not attached)
>>>> + */
>>>> +struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>>>> +                       struct cxl_dev_state *cxlds)
>>>> +{
>>>> +    struct cxl_memdev *cxlmd = cxl_memdev_alloc(cxlds);
>>> Bit marginal but you could do a DEFINE_FREE() for cxlmd
>>> similar to the one that exists for put_cxl_port
>>>
>>> You would then need to steal the pointer for the devm_ call at the
>>> end of this function.
>>
>>
>> We are not freeing cxlmd in case of errors after we got the 
>> allocation, so I think it makes sense.
>>
>>
>> Thank you.
>>
>>
>>>
>>>> +    int rc;
>>>> +
>>>> +    if (IS_ERR(cxlmd))
>>>> +        return cxlmd;
>>>> +
>>>> +    rc = dev_set_name(&cxlmd->dev, "mem%d", cxlmd->id);
>>>> +    if (rc) {
>>>> +        put_device(&cxlmd->dev);
>>>> +        return ERR_PTR(rc);
>>>> +    }
>>>> +
>>>> +    return devm_cxl_memdev_add_or_reset(host, cxlmd);
>>>> +}
>>>> +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, "CXL");
>>
>
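
For readers less familiar with the scope-based cleanup suggested in the review above (__free(kfree) plus return_ptr(), and "stealing" the pointer before the devm_ call), here is a minimal userspace sketch of the idea. It is not the kernel's <linux/cleanup.h> and not the code that went into v20: __free_memdev, steal_ptr(), struct memdev, memdev_alloc(), memdev_add() and memdev_selfcheck() are hypothetical stand-ins built on the GCC/Clang cleanup attribute. In the kernel, once device_initialize() has run, the cleanup would have to be put_device() rather than a plain free.

```c
#include <assert.h>
#include <stdlib.h>

struct memdev { int id; };		/* hypothetical stand-in for struct cxl_memdev */

static int frees;			/* counts automatic frees, demo only */
static struct memdev *registered;	/* stand-in for a devm-managed registration */

/* Runs automatically when an annotated variable goes out of scope. */
static void memdev_auto_free(struct memdev **p)
{
	if (*p) {
		free(*p);
		frees++;
	}
}
#define __free_memdev __attribute__((cleanup(memdev_auto_free)))

/* Take the pointer and disarm the automatic free (the role return_ptr() plays). */
#define steal_ptr(p) ({ __typeof__(p) stolen_ = (p); (p) = NULL; stolen_; })

/* A negative id simulates the id allocation failing. */
static struct memdev *memdev_alloc(int id)
{
	struct memdev *md __free_memdev = calloc(1, sizeof(*md));

	if (!md)
		return NULL;
	if (id < 0)
		return NULL;		/* md is freed here, no "goto err" needed */
	md->id = id;
	return steal_ptr(md);		/* success: ownership moves to the caller */
}

/* fail_name simulates a later setup step (such as naming the device) failing. */
static struct memdev *memdev_add(int id, int fail_name)
{
	struct memdev *md __free_memdev = memdev_alloc(id);

	if (!md)
		return NULL;
	if (fail_name)
		return NULL;		/* freed automatically on this error path */
	registered = steal_ptr(md);	/* hand ownership to the registration */
	return registered;
}

/* Walks each path once and returns the number of automatic frees. */
static int memdev_selfcheck(void)
{
	struct memdev *md;

	md = memdev_alloc(-1);		/* error path: freed for us */
	assert(md == NULL);
	md = memdev_alloc(3);		/* success path: we own it now */
	assert(md && md->id == 3);
	free(md);
	md = memdev_add(5, 1);		/* late failure: freed for us */
	assert(md == NULL);
	md = memdev_add(7, 0);		/* success: owned by "registered" */
	assert(md && md == registered);
	free(registered);
	registered = NULL;
	return frees;
}
```

Calling memdev_selfcheck() exercises both error paths and both success paths. Only the two error paths trigger the automatic free, which is the property that lets the alloc function drop its manual err: label.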