Message-ID: <74fbeca5-2c64-40a5-b399-621b8c9a1271@amd.com>
Date: Tue, 2 Dec 2025 08:47:54 +0000
From: Alejandro Lucero Palau <alucerop@....com>
To: dan.j.williams@...el.com, alejandro.lucero-palau@....com,
linux-cxl@...r.kernel.org, netdev@...r.kernel.org, edward.cree@....com,
davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
edumazet@...gle.com, dave.jiang@...el.com
Subject: Re: [PATCH v21 01/23] cxl/mem: refactor memdev allocation
On 12/2/25 02:52, dan.j.williams@...el.com wrote:
> alejandro.lucero-palau@ wrote:
>> From: Alejandro Lucero <alucerop@....com>
>>
>> In preparation for always-synchronous memdev attach, refactor memdev
>> allocation and fix release bug in devm_cxl_add_memdev() when error after
>> a successful allocation.
> Never do "refactor and fix". Always do "fix" then "refactor" separately.
Ok.
> In this case though I wonder what release bug you are referring to?
>
> If cxl_memdev_alloc() fails, nothing to free.
>
> If dev_set_name() fails, it puts the device which calls
> cxl_memdev_release() which undoes cxl_memdev_alloc(). (Now, that weird
> and busted devm_cxl_memdev_edac_release() somehow snuck into
> cxl_memdev_release() when I was not looking. I will fix that separately,
> but no leak there that I can see.)
>
> If cdev_device_add() fails we need to shutdown the ioctl path, but
> otherwise put_device() cleans everything up.
>
> If the devm_add_action_or_reset() fails the device needs to be both
> unregistered and final put. It does not use device_unregister() because
> the cdev also needs to be deleted. So cdev_device_del() handles the
> device_del() and the caller is responsible for the final put_device().
>
> What bug are you referring to?
You are right. I missed that the release callback in cxl_memdev_type
runs on the final put_device().
I guess I got confused between the devm and __free approaches ...
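The point conceded here, that the final put_device() invokes the device
type's release callback and so undoes the allocation, can be sketched in
user-space C. This is only a minimal model under stated assumptions:
every toy_* name is hypothetical and is not a kernel API, and a plain
integer stands in for the kernel's reference counting.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct device: a refcount plus a release hook. */
struct toy_device {
	int refcount;
	void (*release)(struct toy_device *dev);
};

/* Hypothetical stand-in for struct cxl_memdev; dev must stay the first member. */
struct toy_memdev {
	struct toy_device dev;
	int id;
};

/* Counts how many times the release hook has run. */
static int toy_release_calls;

/* Plays the role of the release callback: frees the containing object. */
static void toy_memdev_release(struct toy_device *dev)
{
	/* dev is the first member, so this cast recovers the container. */
	struct toy_memdev *md = (struct toy_memdev *)dev;

	toy_release_calls++;
	free(md);
}

/* Allocate the object holding one reference, with the release hook set. */
static struct toy_memdev *toy_memdev_alloc(int id)
{
	struct toy_memdev *md = calloc(1, sizeof(*md));

	if (!md)
		return NULL;
	md->id = id;
	md->dev.refcount = 1;
	md->dev.release = toy_memdev_release;
	return md;
}

/* Drop one reference; the last put runs the release hook. */
static void toy_put_device(struct toy_device *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0)
		dev->release(dev);
}

/*
 * Allocate, then optionally simulate a naming failure. The error path
 * only drops the reference; the release hook does the freeing, so an
 * explicit free() here would be a double free.
 */
static struct toy_memdev *toy_add_memdev(int id, int fail_naming)
{
	struct toy_memdev *md = toy_memdev_alloc(id);

	if (!md)
		return NULL;
	if (fail_naming) {
		toy_put_device(&md->dev);
		return NULL;
	}
	return md;
}
```

In this model a failed toy_add_memdev() leaks nothing, because the put
on the error path is what frees the object. A caller that gets a valid
pointer back owns the single reference and drops it with
toy_put_device().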
>
>> The diff is busy as this moves cxl_memdev_alloc() down below the definition
>> of cxl_memdev_fops and introduces devm_cxl_memdev_add_or_reset() to
>> preclude needing to export more symbols from the cxl_core.
> Will need to read the code to figure out what this patch is trying to do
> because this changelog is not orienting me to the problem that is being
> solved.
>
>> Fixes: 1c3333a28d45 ("cxl/mem: Do not rely on device_add() side effects for dev_set_name() failures")
> Maybe this Fixes: tag is wrong and this is instead a bug introduced by
> my probe order RFC? At least Jonathan pinged me about a bug there that I
> will go look at next.
This Fixes: tag is wrong because of what you pointed out above.
Not sure what you/Jonathan are referring to here. PJ found a problem
with cyclic module dependencies with the changes introduced by these two
first patches.
It can be solved by changing the CXL_BUS config from tristate to bool,
which PJ tried successfully. I was expecting some comments before adding
it to the next patchset version ...
>
>> Signed-off-by: Dan Williams <dan.j.williams@...el.com>
> Why does this have my Sign-off?
It was your original patch.
>> Signed-off-by: Alejandro Lucero <alucerop@....com>
>> ---
>> drivers/cxl/core/memdev.c | 134 +++++++++++++++++++++-----------------
>> drivers/cxl/private.h | 10 +++
>> 2 files changed, 86 insertions(+), 58 deletions(-)
>> create mode 100644 drivers/cxl/private.h
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index e370d733e440..8de19807ac7b 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -8,6 +8,7 @@
>> #include <linux/idr.h>
>> #include <linux/pci.h>
>> #include <cxlmem.h>
>> +#include "private.h"
>> #include "trace.h"
>> #include "core.h"
>>
>> @@ -648,42 +649,25 @@ static void detach_memdev(struct work_struct *work)
>>
>> static struct lock_class_key cxl_memdev_key;
>>
>> -static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>> - const struct file_operations *fops)
>> +int devm_cxl_memdev_add_or_reset(struct device *host, struct cxl_memdev *cxlmd)
> Can you say more why Type-2 drivers need an "_or_reset()" export? If a
> Type-2 driver is calling devm_cxl_add_memdev() from its ->probe()
> routine, then just return on failure. Confused.
Well, maybe it is you who should answer that question. It comes from
something you suggested I use to solve problems with Type2 and
potential module removal. I first added those patches two months ago,
and you are only now commenting on them.
Here is the short story: my comments suggesting how I thought we should
deal with that problem were ignored, then you suddenly chimed in and
offered your way of solving it, pointing to your branch. I used and
tested it, and it did indeed fix those potential removal problems ... I
worked on the patches to solve some minor issues, then Jonathan
suggested refactoring the first patch. I thought I had found a problem
with the allocation ... I tried to solve it ... I kept the original
commit since you were the one who proposed it and you are a native
English speaker ... in the next patch review you realized that those are
indeed your work on solving the problem ... then you proposed another
patch ...
I really hope you review all this in the impending v22, where I will
present a solution for Type2 initialization when HDM is committed by
firmware.