[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <56186794-e514-e606-8a3e-1b73bdee7bae@amd.com>
Date: Mon, 20 Jan 2025 17:27:07 +0000
From: Alejandro Lucero Palau <alucerop@....com>
To: Dan Williams <dan.j.williams@...el.com>, alejandro.lucero-palau@....com,
linux-cxl@...r.kernel.org, netdev@...r.kernel.org, edward.cree@....com,
davem@...emloft.net, kuba@...nel.org, pabeni@...hat.com,
edumazet@...gle.com, dave.jiang@...el.com
Subject: Re: [PATCH v9 14/27] sfc: create type2 cxl memdev
On 1/18/25 02:41, Dan Williams wrote:
> alejandro.lucero-palau@ wrote:
>> From: Alejandro Lucero <alucerop@....com>
>>
>> Use cxl API for creating a cxl memory device using the type2
>> cxl_dev_state struct.
>>
>> Signed-off-by: Alejandro Lucero <alucerop@....com>
>> Reviewed-by: Martin Habets <habetsm.xilinx@...il.com>
>> Reviewed-by: Fan Ni <fan.ni@...sung.com>
>> Acked-by: Edward Cree <ecree.xilinx@...il.com>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@...wei.com>
>> ---
>> drivers/net/ethernet/sfc/efx_cxl.c | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> index 911f29b91bd3..f4bf137fd878 100644
>> --- a/drivers/net/ethernet/sfc/efx_cxl.c
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -96,10 +96,19 @@ int efx_cxl_init(struct efx_probe_data *probe_data)
>> */
>> cxl_set_media_ready(cxl->cxlds);
> It is unfortunate the media_ready is just being used as an
> offline/online flag for memdevs. I would be open to just switching to
> typical device online/offline semantics and drop the "media_ready" flag.
Not sure I understand the semantics here. Note our device, as the
related register being optional, has no way for asking the HW about this
(by CXL means), mainly because it is not necessary. I guess this
hardware state is there because it can be needed, but again, not sure I
understand media-ready vs memory online.
>
>>
>> + cxl->cxlmd = devm_cxl_add_memdev(&pci_dev->dev, cxl->cxlds);
>> + if (IS_ERR(cxl->cxlmd)) {
>> + pci_err(pci_dev, "CXL accel memdev creation failed");
>> + rc = PTR_ERR(cxl->cxlmd);
>> + goto err_memdev;
>> + }
>> +
>> probe_data->cxl = cxl;
>>
>> return 0;
>>
>> +err_memdev:
>> + cxl_release_resource(cxl->cxlds, CXL_RES_RAM);
>> err_resource_set:
>> kfree(cxl->cxlds);
> In general a function should not mix devm and goto as that is a recipe
> for bugs.
>
> The bug I see here is that devm_cxl_add_memdev() runs the teardown flow
> *after* efx_pci_probe() returns an error code. That happens in
> device_unbind_cleanup(), but when it goes to cleanup endpoint decoders
> and anything else that might reference @cxlds it crashes because @cxlds
> is long gone.
>
> So if you use devm_cxl_add_memdev() then cxlds must be devm allocated as
> well to make sure it gets freed in the proper reverse order.
That is true and I think it is easy to fix, even with your changes for
patch 1.
I'll show it in v10.
Thanks!
Powered by blists - more mailing lists