lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <374f8a2c-df06-4df9-8816-d91d3236cd58@amd.com>
Date: Fri, 14 Nov 2025 11:10:50 +0000
From: Alejandro Lucero Palau <alucerop@....com>
To: Jonathan Cameron <jonathan.cameron@...wei.com>,
 alejandro.lucero-palau@....com
Cc: linux-cxl@...r.kernel.org, netdev@...r.kernel.org,
 dan.j.williams@...el.com, edward.cree@....com, davem@...emloft.net,
 kuba@...nel.org, pabeni@...hat.com, edumazet@...gle.com, dave.jiang@...el.com
Subject: Re: [PATCH v20 01/22] cxl/mem: Arrange for always-synchronous memdev
 attach


On 11/12/25 14:53, Jonathan Cameron wrote:
> On Mon, 10 Nov 2025 15:36:36 +0000
> alejandro.lucero-palau@....com wrote:
>
>> From: Dan Williams <dan.j.williams@...el.com>
>>
>> In preparation for CXL accelerator drivers that have a hard dependency on
>> CXL capability initialization, arrange for the endpoint probe result to be
>> conveyed to the caller of devm_cxl_add_memdev().
>>
>> As it stands cxl_pci does not care about the attach state of the cxl_memdev
>> because all generic memory expansion functionality can be handled by the
>> cxl_core. For accelerators, that driver needs to know perform driver
>> specific initialization if CXL is available, or exectute a fallback to PCIe
>> only operation.
>>
>> By moving devm_cxl_add_memdev() to cxl_mem.ko it removes async module
>> loading as one reason that a memdev may not be attached upon return from
>> devm_cxl_add_memdev().
>>
>> The diff is busy as this moves cxl_memdev_alloc() down below the definition
>> of cxl_memdev_fops and introduces devm_cxl_memdev_add_or_reset() to
>> preclude needing to export more symbols from the cxl_core.
>>
>> Signed-off-by: Dan Williams <dan.j.williams@...el.com>
> Alejandro, read submitting patches again.  Whilst the first sign off should
> indeed by Dan's this also needs one from you as a 'handler' of the patch.
>
> Be very careful checking these tag chains. If they are wrong no one can
> merge the set and it just acts as a silly blocker.


Hi Jonathan,


I did the amend but it is true I did some work on it. Would it be enough 
to add my signed-off-by along with Dan's one?


> I would have split this up and made the changes to cxl_memdev_alloc in
> a precursor patch (use of __free is obvious one) then could have stated
> that that was simply moved in this patch.


OK. I think I was fixing a bug in original Dan's patch regarding cxlmd 
release in case of error inside devm_cxl_add_memdev, but I think the bug 
is in the current code of that function as it is not properly released 
if error after a successful allocation. So splitting the patch could 
allow to make this clearer and adding the Fixes tag as well.


> There are other changes in there that are really hard to spot though
> and I think there are some bugs lurking in error paths.


I did spot one after your comment, checking cxlmd pointer is not an 
error pointer inside __cxlmd_free. If you spotted something else, please 
tell me :-)


Thank you!


> Jonathan
>
>> ---
>>   drivers/cxl/Kconfig       |   2 +-
>>   drivers/cxl/core/memdev.c | 101 ++++++++++++++------------------------
>>   drivers/cxl/mem.c         |  41 ++++++++++++++++
>>   drivers/cxl/private.h     |  10 ++++
>>   4 files changed, 90 insertions(+), 64 deletions(-)
>>   create mode 100644 drivers/cxl/private.h
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index e370d733e440..14b4601faf66 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -8,6 +8,7 @@
>>   #include <linux/idr.h>
>>   #include <linux/pci.h>
>>   #include <cxlmem.h>
>> +#include "private.h"
>>   #include "trace.h"
>>   #include "core.h"
>>   
>> @@ -648,42 +649,25 @@ static void detach_memdev(struct work_struct *work)
>>   
>>   static struct lock_class_key cxl_memdev_key;
>>   
>> -static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>> -					   const struct file_operations *fops)
>> +int devm_cxl_memdev_add_or_reset(struct device *host, struct cxl_memdev *cxlmd)
>>   {
>> -	struct cxl_memdev *cxlmd;
>> -	struct device *dev;
>> -	struct cdev *cdev;
>> +	struct device *dev = &cxlmd->dev;
>> +	struct cdev *cdev = &cxlmd->cdev;
>>   	int rc;
>>   
>> -	cxlmd = kzalloc(sizeof(*cxlmd), GFP_KERNEL);
>> -	if (!cxlmd)
>> -		return ERR_PTR(-ENOMEM);
>> -
>> -	rc = ida_alloc_max(&cxl_memdev_ida, CXL_MEM_MAX_DEVS - 1, GFP_KERNEL);
>> -	if (rc < 0)
>> -		goto err;
>> -	cxlmd->id = rc;
>> -	cxlmd->depth = -1;
>> -
>> -	dev = &cxlmd->dev;
>> -	device_initialize(dev);
>> -	lockdep_set_class(&dev->mutex, &cxl_memdev_key);
>> -	dev->parent = cxlds->dev;
>> -	dev->bus = &cxl_bus_type;
>> -	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
>> -	dev->type = &cxl_memdev_type;
>> -	device_set_pm_not_required(dev);
>> -	INIT_WORK(&cxlmd->detach_work, detach_memdev);
>> -
>> -	cdev = &cxlmd->cdev;
>> -	cdev_init(cdev, fops);
>> -	return cxlmd;
>> +	rc = cdev_device_add(cdev, dev);
>> +	if (rc) {
>> +		/*
>> +		 * The cdev was briefly live, shutdown any ioctl operations that
>> +		 * saw that state.
>> +		 */
>> +		cxl_memdev_shutdown(dev);
>> +		return rc;
>> +	}
>>   
>> -err:
>> -	kfree(cxlmd);
>> -	return ERR_PTR(rc);
>> +	return devm_add_action_or_reset(host, cxl_memdev_unregister, cxlmd);
>>   }
>> +EXPORT_SYMBOL_NS_GPL(devm_cxl_memdev_add_or_reset, "CXL");
>>   
>>   static long __cxl_memdev_ioctl(struct cxl_memdev *cxlmd, unsigned int cmd,
>>   			       unsigned long arg)
>> @@ -1051,50 +1035,41 @@ static const struct file_operations cxl_memdev_fops = {
>>   	.llseek = noop_llseek,
>>   };
>>   
>> -struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>> -				       struct cxl_dev_state *cxlds)
>> +struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds)
>>   {
>> -	struct cxl_memdev *cxlmd;
>> +	struct cxl_memdev *cxlmd __free(kfree) =
>> +		kzalloc(sizeof(*cxlmd), GFP_KERNEL);
> Trivial and perhaps not worth the hassle.
> I'd pull this out of the declarations block to have
>
>   	struct device *dev;
>   	struct cdev *cdev;
>   	int rc;
>
> 	struct cxl_memdev *cxlmd __free(kfree) =
> 		kzalloc(sizeof(*cxlmd), GFP_KERNEL);
> 	if (!cxlmd)
> 		return ERR_PTR(-ENOMEM);
>
> That is treat the __free() related statement as an inline declaration of
> the type we only really allow for these.
>
>
>>   	struct device *dev;
>>   	struct cdev *cdev;
>>   	int rc;
>>   
>> -	cxlmd = cxl_memdev_alloc(cxlds, &cxl_memdev_fops);
>> -	if (IS_ERR(cxlmd))
>> -		return cxlmd;
>>   
>> -	dev = &cxlmd->dev;
>> -	rc = dev_set_name(dev, "mem%d", cxlmd->id);
>> -	if (rc)
>> -		goto err;
>> +	if (!cxlmd)
>> +		return ERR_PTR(-ENOMEM);
>>   
>> -	/*
>> -	 * Activate ioctl operations, no cxl_memdev_rwsem manipulation
>> -	 * needed as this is ordered with cdev_add() publishing the device.
>> -	 */
>> +	rc = ida_alloc_max(&cxl_memdev_ida, CXL_MEM_MAX_DEVS - 1, GFP_KERNEL);
>> +	if (rc < 0)
>> +		return ERR_PTR(rc);
>> +	cxlmd->id = rc;
>> +	cxlmd->depth = -1;
>>   	cxlmd->cxlds = cxlds;
>>   	cxlds->cxlmd = cxlmd;
> These two lines weren't previously in cxl_memdev_alloc()
> I'd like a statement in the commit message of why they are now. It seems
> harmless because they are still ordered before the add and are
> ultimately freed
>
> I'm not immediately spotting why they now are.  This whole code shift
> and complex diff is enough of a pain I'd be tempted to do the move first
> so that we can then see what is actually changed much more easily.
>
>
>>   
>> -	cdev = &cxlmd->cdev;
>> -	rc = cdev_device_add(cdev, dev);
>> -	if (rc)
>> -		goto err;
>> -
>> -	rc = devm_add_action_or_reset(host, cxl_memdev_unregister, cxlmd);
>> -	if (rc)
>> -		return ERR_PTR(rc);
>> -	return cxlmd;
>> +	dev = &cxlmd->dev;
>> +	device_initialize(dev);
>> +	lockdep_set_class(&dev->mutex, &cxl_memdev_key);
>> +	dev->parent = cxlds->dev;
>> +	dev->bus = &cxl_bus_type;
>> +	dev->devt = MKDEV(cxl_mem_major, cxlmd->id);
>> +	dev->type = &cxl_memdev_type;
>> +	device_set_pm_not_required(dev);
>> +	INIT_WORK(&cxlmd->detach_work, detach_memdev);
>>   
>> -err:
>> -	/*
>> -	 * The cdev was briefly live, shutdown any ioctl operations that
>> -	 * saw that state.
>> -	 */
>> -	cxl_memdev_shutdown(dev);
>> -	put_device(dev);
>> -	return ERR_PTR(rc);
>> +	cdev = &cxlmd->cdev;
>> +	cdev_init(cdev, &cxl_memdev_fops);
>> +	return_ptr(cxlmd);
>>   }
>> -EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, "CXL");
>> +EXPORT_SYMBOL_NS_GPL(cxl_memdev_alloc, "CXL");
>>   
>>   static void sanitize_teardown_notifier(void *data)
>>   {
>> diff --git a/drivers/cxl/mem.c b/drivers/cxl/mem.c
>> index d2155f45240d..fa5d901ee817 100644
>> --- a/drivers/cxl/mem.c
>> +++ b/drivers/cxl/mem.c
>> @@ -7,6 +7,7 @@
>>   
>>   #include "cxlmem.h"
>>   #include "cxlpci.h"
>> +#include "private.h"
>>   
>>   /**
>>    * DOC: cxl mem
>> @@ -202,6 +203,45 @@ static int cxl_mem_probe(struct device *dev)
>>   	return devm_add_action_or_reset(dev, enable_suspend, NULL);
>>   }
>>   
>> +static void __cxlmd_free(struct cxl_memdev *cxlmd)
>> +{
>> +	cxlmd->cxlds->cxlmd = NULL;
>> +	put_device(&cxlmd->dev);
>> +	kfree(cxlmd);
>> +}
>> +
>> +DEFINE_FREE(cxlmd_free, struct cxl_memdev *, __cxlmd_free(_T))
>> +
>> +/**
>> + * devm_cxl_add_memdev - Add a CXL memory device
>> + * @host: devres alloc/release context and parent for the memdev
>> + * @cxlds: CXL device state to associate with the memdev
>> + *
>> + * Upon return the device will have had a chance to attach to the
>> + * cxl_mem driver, but may fail if the CXL topology is not ready
>> + * (hardware CXL link down, or software platform CXL root not attached)
>> + */
>> +struct cxl_memdev *devm_cxl_add_memdev(struct device *host,
>> +				       struct cxl_dev_state *cxlds)
>> +{
>> +	struct cxl_memdev *cxlmd __free(cxlmd_free) = cxl_memdev_alloc(cxlds);
>> +	int rc;
>> +
>> +	if (IS_ERR(cxlmd))
>> +		return cxlmd;
>> +
>> +	rc = dev_set_name(&cxlmd->dev, "mem%d", cxlmd->id);
>> +	if (rc)
>> +		return ERR_PTR(rc);
>> +
>> +	rc = devm_cxl_memdev_add_or_reset(host, cxlmd);
>> +	if (rc)
>> +		return ERR_PTR(rc);
> Is the reference tracking right here?  If the above call fails
> then it is possible cxl_memdev_unregister() has been called
> or just cxl_memdev_shutdown().
>
> If nothing else (and I suspect there is worse but haven't
> counted references) that will set
> cxlmd->cxlds = NULL;
> s part of cxl_memdev_shutdown()
> The __cxlmd_free() then dereferences that and boom.
>
>
>> +
>> +	return no_free_ptr(cxlmd);
>> +}
>> +EXPORT_SYMBOL_NS_GPL(devm_cxl_add_memdev, "CXL");

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ