[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d1beb594-3bd5-b2c7-9753-4c1d8aeb52db@amd.com>
Date: Tue, 14 Jan 2025 16:40:51 +0000
From: Alejandro Lucero Palau <alucerop@....com>
To: Dan Williams <dan.j.williams@...el.com>, alejandro.lucero-palau@....com,
linux-cxl@...r.kernel.org, netdev@...r.kernel.org, martin.habets@...inx.com,
edward.cree@....com, davem@...emloft.net, kuba@...nel.org,
pabeni@...hat.com, edumazet@...gle.com, dave.jiang@...el.com
Subject: Re: [PATCH v8 01/27] cxl: add type2 device basic support
On 1/14/25 14:35, Alejandro Lucero Palau wrote:
>
> On 1/8/25 14:32, Alejandro Lucero Palau wrote:
>>
>> On 1/8/25 01:33, Dan Williams wrote:
>>> Dan Williams wrote:
>>>> alejandro.lucero-palau@ wrote:
>>>>> From: Alejandro Lucero <alucerop@....com>
>>>>>
>>>>> Differentiate CXL memory expanders (type 3) from CXL device
>>>>> accelerators
>>>>> (type 2) with a new function for initializing cxl_dev_state.
>>>>>
>>>>> Create accessors to cxl_dev_state to be used by accel drivers.
>>>>>
>>>>> Based on previous work by Dan Williams [1]
>>>>>
>>>>> Link: [1]
>>>>> https://lore.kernel.org/linux-cxl/168592160379.1948938.12863272903570476312.stgit@dwillia2-xfh.jf.intel.com/
>>>>> Signed-off-by: Alejandro Lucero <alucerop@....com>
>>>>> Co-developed-by: Dan Williams <dan.j.williams@...el.com>
>>>>> Reviewed-by: Dave Jiang <dave.jiang@...el.com>
>>>>> Reviewed-by: Fan Ni <fan.ni@...sung.com>
>>>> This patch causes
>>> Whoops, forgot to complete this thought. Someting in this series
>>> causes:
>>>
>>> depmod: ERROR: Cycle detected: ecdh_generic
>>> depmod: ERROR: Cycle detected: tpm
>>> depmod: ERROR: Cycle detected: cxl_mock -> cxl_core -> cxl_mock
>>> depmod: ERROR: Cycle detected: encrypted_keys
>>> depmod: ERROR: Found 2 modules in dependency cycles!
>>>
>>> I think the non CXL ones are false likely triggered by the CXL causing
>>> depmod to exit early.
>>>
>>> Given cxl-test is unfamiliar territory to many submitters I always
>>> offer
>>> to fix up the breakage. I came up with the below incremental patch to
>>> fold in that also addresses my other feedback.
>>>
>>> Now the depmod error is something Alison saw too, and while I can also
>>> see it on patch1 if I do:
>>>
>>> - apply whole series
>>> - build => see the error
>>> - rollback patch1
>>> - build => see the error
>>>
>>> ...a subsequent build the error goes away, so I think that transient
>>> behavior is a quirk of how cxl-test is built, but some later patch in
>>> that series makes the failure permanent.
>>>
>>> In any event I figured that out after creating the below fixup and
>>> realizing that it does not fix the cxl-test build issue:
>>
>>
>> Ok. but it is a good way of showing what you had in your mind about
>> the suggested changes.
>>
>> I'll use it for v10.
>>
>> Thanks
>>
>
> Hi Dan,
>
>
> There's a problem with this approach and it is the need of the driver
> having access to internal cxl structs like cxl_dev_state.
>
> Your patch does not cover it but for an accel driver that struct needs
> to be allocated before using the new cxl_dev_state_init.
>
> I think we reached an agreement in initial discussions about avoiding
> this need through an API for accel drivers indirectly doing whatever
> is needed regarding internal CXL structs. Initially it was stated this
> being necessary for avoiding drivers doing wrong things but Jonathan
> pointed out the main concern being changing those internal structs in
> the future could benefit from this approach. Whatever the reason, that
> was the assumption.
>
>
> I could add a function for accel drivers doing the allocation as with
> current v9 code, and then using your changes for having common code.
>
>
> Also, I completely agree with merging the serial and dvsec
> initializations through arguments to cxl_dev_state_init, but we need
> the cxl_set_resource function for accel drivers. The current code for
> adding resources with memdev is relying on mbox commands, and although
> we could change that code for supporting accel drivers without an
> mbox, I would say the function/code added is simple enough for not
> requiring that effort. Note my goal is for an accel device without an
> mbox, but we will see devices with one in the future, so I bet for
> leaving any change there to that moment.
>
I forgot to say that I will send the code setting the resources in
another patch, as this seems to be what you prefer, assuming you agree
with my previous comment.
> Let me know what you think about these two things. I would like to
> send v10 this week.
>
>
> Thank you
>
>
>>
>>>
>>> -- 8< --
>>> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
>>> index 548564c770c0..584766d34b05 100644
>>> --- a/drivers/cxl/core/mbox.c
>>> +++ b/drivers/cxl/core/mbox.c
>>> @@ -1435,7 +1435,7 @@ int cxl_mailbox_init(struct cxl_mailbox
>>> *cxl_mbox, struct device *host)
>>> }
>>> EXPORT_SYMBOL_NS_GPL(cxl_mailbox_init, "CXL");
>>> -struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev)
>>> +struct cxl_memdev_state *cxl_memdev_state_create(struct device
>>> *dev, u64 serial, u16 dvsec)
>>> {
>>> struct cxl_memdev_state *mds;
>>> @@ -1445,11 +1445,9 @@ struct cxl_memdev_state
>>> *cxl_memdev_state_create(struct device *dev)
>>> return ERR_PTR(-ENOMEM);
>>> }
>>> + cxl_dev_state_init(&mds->cxlds, dev, CXL_DEVTYPE_CLASSMEM,
>>> serial,
>>> + dvsec);
>>> mutex_init(&mds->event.log_lock);
>>> - mds->cxlds.dev = dev;
>>> - mds->cxlds.reg_map.host = dev;
>>> - mds->cxlds.reg_map.resource = CXL_RESOURCE_NONE;
>>> - mds->cxlds.type = CXL_DEVTYPE_CLASSMEM;
>>> mds->ram_perf.qos_class = CXL_QOS_CLASS_INVALID;
>>> mds->pmem_perf.qos_class = CXL_QOS_CLASS_INVALID;
>>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>>> index 99f533caae1e..9b8b9b4d1392 100644
>>> --- a/drivers/cxl/core/memdev.c
>>> +++ b/drivers/cxl/core/memdev.c
>>> @@ -617,24 +617,18 @@ static void detach_memdev(struct work_struct
>>> *work)
>>> static struct lock_class_key cxl_memdev_key;
>>> -struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
>>> +void cxl_dev_state_init(struct cxl_dev_state *cxlds, struct device
>>> *dev,
>>> + enum cxl_devtype type, u64 serial, u16 dvsec)
>>> {
>>> - struct cxl_dev_state *cxlds;
>>> -
>>> - cxlds = kzalloc(sizeof(*cxlds), GFP_KERNEL);
>>> - if (!cxlds)
>>> - return ERR_PTR(-ENOMEM);
>>> -
>>> cxlds->dev = dev;
>>> - cxlds->type = CXL_DEVTYPE_DEVMEM;
>>> + cxlds->type = type;
>>> + cxlds->reg_map.host = dev;
>>> + cxlds->reg_map.resource = CXL_RESOURCE_NONE;
>>> cxlds->dpa_res = DEFINE_RES_MEM_NAMED(0, 0, "dpa");
>>> cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
>>> cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
>>> -
>>> - return cxlds;
>>> }
>>> -EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, "CXL");
>>> static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state
>>> *cxlds,
>>> const struct file_operations *fops)
>>> @@ -713,37 +707,6 @@ static int cxl_memdev_open(struct inode *inode,
>>> struct file *file)
>>> return 0;
>>> }
>>> -void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
>>> -{
>>> - cxlds->cxl_dvsec = dvsec;
>>> -}
>>> -EXPORT_SYMBOL_NS_GPL(cxl_set_dvsec, "CXL");
>>> -
>>> -void cxl_set_serial(struct cxl_dev_state *cxlds, u64 serial)
>>> -{
>>> - cxlds->serial = serial;
>>> -}
>>> -EXPORT_SYMBOL_NS_GPL(cxl_set_serial, "CXL");
>>> -
>>> -int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>> - enum cxl_resource type)
>>> -{
>>> - switch (type) {
>>> - case CXL_RES_DPA:
>>> - cxlds->dpa_res = res;
>>> - return 0;
>>> - case CXL_RES_RAM:
>>> - cxlds->ram_res = res;
>>> - return 0;
>>> - case CXL_RES_PMEM:
>>> - cxlds->pmem_res = res;
>>> - return 0;
>>> - }
>>> -
>>> - return -EINVAL;
>>> -}
>>> -EXPORT_SYMBOL_NS_GPL(cxl_set_resource, "CXL");
>>> -
>>> static int cxl_memdev_release_file(struct inode *inode, struct
>>> file *file)
>>> {
>>> struct cxl_memdev *cxlmd =
>>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
>>> index 2a25d1957ddb..1e4b64b8f35a 100644
>>> --- a/drivers/cxl/cxlmem.h
>>> +++ b/drivers/cxl/cxlmem.h
>>> @@ -4,6 +4,7 @@
>>> #define __CXL_MEM_H__
>>> #include <uapi/linux/cxl_mem.h>
>>> #include <linux/pci.h>
>>> +#include <cxl/cxl.h>
>>> #include <linux/cdev.h>
>>> #include <linux/uuid.h>
>>> #include <linux/node.h>
>>> @@ -380,20 +381,6 @@ struct cxl_security_state {
>>> struct kernfs_node *sanitize_node;
>>> };
>>> -/*
>>> - * enum cxl_devtype - delineate type-2 from a generic type-3 device
>>> - * @CXL_DEVTYPE_DEVMEM - Vendor specific CXL Type-2 device
>>> implementing HDM-D or
>>> - * HDM-DB, no requirement that this device implements a
>>> - * mailbox, or other memory-device-standard manageability
>>> - * flows.
>>> - * @CXL_DEVTYPE_CLASSMEM - Common class definition of a CXL Type-3
>>> device with
>>> - * HDM-H and class-mandatory memory device registers
>>> - */
>>> -enum cxl_devtype {
>>> - CXL_DEVTYPE_DEVMEM,
>>> - CXL_DEVTYPE_CLASSMEM,
>>> -};
>>> -
>>> /**
>>> * struct cxl_dpa_perf - DPA performance property entry
>>> * @dpa_range: range for DPA address
>>> @@ -411,9 +398,9 @@ struct cxl_dpa_perf {
>>> /**
>>> * struct cxl_dev_state - The driver device state
>>> *
>>> - * cxl_dev_state represents the CXL driver/device state. It
>>> provides an
>>> - * interface to mailbox commands as well as some cached data about
>>> the device.
>>> - * Currently only memory devices are represented.
>>> + * cxl_dev_state represents the minimal data about a CXL device to
>>> allow
>>> + * the CXL core to manage common initialization of generic CXL and
>>> HDM capabilities of
>>> + * memory expanders and accelerators with device-memory
>>> *
>>> * @dev: The device associated with this CXL state
>>> * @cxlmd: The device representing the CXL.mem capabilities of @dev
>>> @@ -426,7 +413,7 @@ struct cxl_dpa_perf {
>>> * @pmem_res: Active Persistent memory capacity configuration
>>> * @ram_res: Active Volatile memory capacity configuration
>>> * @serial: PCIe Device Serial Number
>>> - * @type: Generic Memory Class device or Vendor Specific Memory device
>>> + * @type: Generic Memory Class device or an accelerator with CXL.mem
>>> * @cxl_mbox: CXL mailbox context
>>> */
>>> struct cxl_dev_state {
>>> @@ -819,7 +806,8 @@ int cxl_dev_state_identify(struct
>>> cxl_memdev_state *mds);
>>> int cxl_await_media_ready(struct cxl_dev_state *cxlds);
>>> int cxl_enumerate_cmds(struct cxl_memdev_state *mds);
>>> int cxl_mem_create_range_info(struct cxl_memdev_state *mds);
>>> -struct cxl_memdev_state *cxl_memdev_state_create(struct device *dev);
>>> +struct cxl_memdev_state *cxl_memdev_state_create(struct device
>>> *dev, u64 serial,
>>> + u16 dvsec);
>>> void set_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>>> unsigned long *cmds);
>>> void clear_exclusive_cxl_commands(struct cxl_memdev_state *mds,
>>> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
>>> index 36098e2b4235..b51e47fd28b3 100644
>>> --- a/drivers/cxl/pci.c
>>> +++ b/drivers/cxl/pci.c
>>> @@ -922,21 +922,19 @@ static int cxl_pci_probe(struct pci_dev *pdev,
>>> const struct pci_device_id *id)
>>> return rc;
>>> pci_set_master(pdev);
>>> - mds = cxl_memdev_state_create(&pdev->dev);
>>> - if (IS_ERR(mds))
>>> - return PTR_ERR(mds);
>>> - cxlds = &mds->cxlds;
>>> - pci_set_drvdata(pdev, cxlds);
>>> -
>>> - cxlds->rcd = is_cxl_restricted(pdev);
>>> - cxl_set_serial(cxlds, pci_get_dsn(pdev));
>>> dvsec = pci_find_dvsec_capability(pdev, PCI_VENDOR_ID_CXL,
>>> CXL_DVSEC_PCIE_DEVICE);
>>> if (!dvsec)
>>> dev_warn(&pdev->dev,
>>> "Device DVSEC not present, skip CXL.mem init\n");
>>> - cxl_set_dvsec(cxlds, dvsec);
>>> + mds = cxl_memdev_state_create(&pdev->dev, pci_get_dsn(pdev),
>>> dvsec);
>>> + if (IS_ERR(mds))
>>> + return PTR_ERR(mds);
>>> + cxlds = &mds->cxlds;
>>> + pci_set_drvdata(pdev, cxlds);
>>> +
>>> + cxlds->rcd = is_cxl_restricted(pdev);
>>> rc = cxl_pci_setup_regs(pdev, CXL_REGLOC_RBI_MEMDEV, &map);
>>> if (rc)
>>> diff --git a/include/cxl/cxl.h b/include/cxl/cxl.h
>>> index aa4480d49e48..9db4fb6d2c74 100644
>>> --- a/include/cxl/cxl.h
>>> +++ b/include/cxl/cxl.h
>>> @@ -4,21 +4,25 @@
>>> #ifndef __CXL_H
>>> #define __CXL_H
>>> -#include <linux/ioport.h>
>>> +#include <linux/types.h>
>>> -enum cxl_resource {
>>> - CXL_RES_DPA,
>>> - CXL_RES_RAM,
>>> - CXL_RES_PMEM,
>>> +/*
>>> + * enum cxl_devtype - delineate type-2 from a generic type-3 device
>>> + * @CXL_DEVTYPE_DEVMEM - Vendor specific CXL Type-2 device
>>> implementing HDM-D or
>>> + * HDM-DB, no requirement that this device implements a
>>> + * mailbox, or other memory-device-standard manageability
>>> + * flows.
>>> + * @CXL_DEVTYPE_CLASSMEM - Common class definition of a CXL Type-3
>>> device with
>>> + * HDM-H and class-mandatory memory device registers
>>> + */
>>> +enum cxl_devtype {
>>> + CXL_DEVTYPE_DEVMEM,
>>> + CXL_DEVTYPE_CLASSMEM,
>>> };
>>> struct cxl_dev_state;
>>> struct device;
>>> -struct cxl_dev_state *cxl_accel_state_create(struct device *dev);
>>> -
>>> -void cxl_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec);
>>> -void cxl_set_serial(struct cxl_dev_state *cxlds, u64 serial);
>>> -int cxl_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>>> - enum cxl_resource);
>>> +void cxl_dev_state_init(struct cxl_dev_state *cxlds, struct device
>>> *dev,
>>> + enum cxl_devtype type, u64 serial, u16 dvsec);
>>> #endif
>>> diff --git a/tools/testing/cxl/test/mem.c
>>> b/tools/testing/cxl/test/mem.c
>>> index 347c1e7b37bd..24cac1cc30f9 100644
>>> --- a/tools/testing/cxl/test/mem.c
>>> +++ b/tools/testing/cxl/test/mem.c
>>> @@ -1500,7 +1500,7 @@ static int cxl_mock_mem_probe(struct
>>> platform_device *pdev)
>>> if (rc)
>>> return rc;
>>> - mds = cxl_memdev_state_create(dev);
>>> + mds = cxl_memdev_state_create(dev, pdev->id, 0);
>>> if (IS_ERR(mds))
>>> return PTR_ERR(mds);
>>> @@ -1516,7 +1516,6 @@ static int cxl_mock_mem_probe(struct
>>> platform_device *pdev)
>>> mds->event.buf = (struct cxl_get_event_payload *)
>>> mdata->event_buf;
>>> INIT_DELAYED_WORK(&mds->security.poll_dwork,
>>> cxl_mockmem_sanitize_work);
>>> - cxlds->serial = pdev->id;
>>> if (is_rcd(pdev))
>>> cxlds->rcd = true;
Powered by blists - more mailing lists