lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e5a4836d-a405-5b12-62a7-e45b39fb12ad@amd.com>
Date: Fri, 19 Jul 2024 07:03:58 +0100
From: Alejandro Lucero Palau <alucerop@....com>
To: Dave Jiang <dave.jiang@...el.com>, alejandro.lucero-palau@....com,
 linux-cxl@...r.kernel.org, netdev@...r.kernel.org, dan.j.williams@...el.com,
 martin.habets@...inx.com, edward.cree@....com, davem@...emloft.net,
 kuba@...nel.org, pabeni@...hat.com, edumazet@...gle.com,
 richard.hughes@....com
Subject: Re: [PATCH v2 01/15] cxl: add type2 device basic support


On 7/19/24 00:12, Dave Jiang wrote:
>
> On 7/15/24 10:28 AM, alejandro.lucero-palau@....com wrote:
>> From: Alejandro Lucero <alucerop@....com>
>>
>> Differientiate Type3, aka memory expanders, from Type2, aka device
>> accelerators, with a new function for initializing cxl_dev_state.
>>
>> Create opaque struct to be used by accelerators relying on new access
>> functions in following patches.
>>
>> Add SFC ethernet network driver as the client.
>>
>> Based on https://lore.kernel.org/linux-cxl/168592149709.1948938.8663425987110396027.stgit@dwillia2-xfh.jf.intel.com/T/#m52543f85d0e41ff7b3063fdb9caa7e845b446d0e
>>
>> Signed-off-by: Alejandro Lucero <alucerop@....com>
>> Co-developed-by: Dan Williams <dan.j.williams@...el.com>
>> ---
>>   drivers/cxl/core/memdev.c             | 52 ++++++++++++++++++++++++++
>>   drivers/net/ethernet/sfc/Makefile     |  2 +-
>>   drivers/net/ethernet/sfc/efx.c        |  4 ++
>>   drivers/net/ethernet/sfc/efx_cxl.c    | 53 +++++++++++++++++++++++++++
>>   drivers/net/ethernet/sfc/efx_cxl.h    | 29 +++++++++++++++
>>   drivers/net/ethernet/sfc/net_driver.h |  4 ++
>>   include/linux/cxl_accel_mem.h         | 22 +++++++++++
>>   include/linux/cxl_accel_pci.h         | 23 ++++++++++++
> Maybe create an include/linux/cxl and then we can put headers in there.
>
>>   8 files changed, 188 insertions(+), 1 deletion(-)
>>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.c
>>   create mode 100644 drivers/net/ethernet/sfc/efx_cxl.h
>>   create mode 100644 include/linux/cxl_accel_mem.h
>>   create mode 100644 include/linux/cxl_accel_pci.h
>>
>> diff --git a/drivers/cxl/core/memdev.c b/drivers/cxl/core/memdev.c
>> index 0277726afd04..61b5d35b49e7 100644
>> --- a/drivers/cxl/core/memdev.c
>> +++ b/drivers/cxl/core/memdev.c
>> @@ -8,6 +8,7 @@
>>   #include <linux/idr.h>
>>   #include <linux/pci.h>
>>   #include <cxlmem.h>
>> +#include <linux/cxl_accel_mem.h>
>>   #include "trace.h"
>>   #include "core.h"
>>   
>> @@ -615,6 +616,25 @@ static void detach_memdev(struct work_struct *work)
>>   
>>   static struct lock_class_key cxl_memdev_key;
>>   
>> +struct cxl_dev_state *cxl_accel_state_create(struct device *dev)
>> +{
>> +	struct cxl_dev_state *cxlds;
>> +
>> +	cxlds = devm_kzalloc(dev, sizeof(*cxlds), GFP_KERNEL);
> Naked cxlds. Do you think you'll need an accel_dev_state to wrap around cxl_dev_state similar to cxl_memdev_state in order to store accel related information? I also wonder if 'struct cxl_dev_state' should be a public definition. Need to look at the rest of the patchset to circle back.
>

Not sure I understand your concern. Are you saying we need to introduce 
an cxl_accel_state struct? Fro my work and I guess from Dan's original 
patch, it seems it is not needed, although I have already raised my 
concerns about, maybe, current structs requiring a refactoring due to 
the optional capabilities for Type2.

Regarding if cxl_dev_state needs to be public, this patchet version 
defines it as opaque for addressing the concerns about accel drivers 
need to be "controlled".


>> +	if (!cxlds)
>> +		return ERR_PTR(-ENOMEM);
>> +
>> +	cxlds->dev = dev;
>> +	cxlds->type = CXL_DEVTYPE_DEVMEM;
>> +
>> +	cxlds->dpa_res = DEFINE_RES_MEM_NAMED(0, 0, "dpa");
>> +	cxlds->ram_res = DEFINE_RES_MEM_NAMED(0, 0, "ram");
>> +	cxlds->pmem_res = DEFINE_RES_MEM_NAMED(0, 0, "pmem");
>> +
>> +	return cxlds;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_state_create, CXL);
> I do wonder if we should have a common device state init helper function to init all the common bits:
> int cxlds_init(struct *dev, enum cxl_devtype devtype)
>
>
>> +
>>   static struct cxl_memdev *cxl_memdev_alloc(struct cxl_dev_state *cxlds,
>>   					   const struct file_operations *fops)
>>   {
>> @@ -692,6 +712,38 @@ static int cxl_memdev_open(struct inode *inode, struct file *file)
>>   	return 0;
>>   }
>>   
>> +
>> +void cxl_accel_set_dvsec(struct cxl_dev_state *cxlds, u16 dvsec)
>> +{
>> +	cxlds->cxl_dvsec = dvsec;
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_dvsec, CXL);
>> +
>> +void cxl_accel_set_serial(struct cxl_dev_state *cxlds, u64 serial)
>> +{
>> +	cxlds->serial= serial;
> Missing space before '='
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_serial, CXL);
>> +
>> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>> +			    enum accel_resource type)
>> +{
>> +	switch (type) {
>> +	case CXL_ACCEL_RES_DPA:
>> +		cxlds->dpa_res = res;
>> +		return;
>> +	case CXL_ACCEL_RES_RAM:
>> +		cxlds->ram_res = res;
>> +		return;
>> +	case CXL_ACCEL_RES_PMEM:
>> +		cxlds->pmem_res = res;
>> +		return;
>> +	default:
>> +		dev_err(cxlds->dev, "unkown resource type (%u)\n", type);
>> +	}
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_accel_set_resource, CXL);
>> +
>>   static int cxl_memdev_release_file(struct inode *inode, struct file *file)
>>   {
>>   	struct cxl_memdev *cxlmd =
>> diff --git a/drivers/net/ethernet/sfc/Makefile b/drivers/net/ethernet/sfc/Makefile
>> index 8f446b9bd5ee..e80c713c3b0c 100644
>> --- a/drivers/net/ethernet/sfc/Makefile
>> +++ b/drivers/net/ethernet/sfc/Makefile
>> @@ -7,7 +7,7 @@ sfc-y			+= efx.o efx_common.o efx_channels.o nic.o \
>>   			   mcdi_functions.o mcdi_filters.o mcdi_mon.o \
>>   			   ef100.o ef100_nic.o ef100_netdev.o \
>>   			   ef100_ethtool.o ef100_rx.o ef100_tx.o \
>> -			   efx_devlink.o
>> +			   efx_devlink.o efx_cxl.o
>>   sfc-$(CONFIG_SFC_MTD)	+= mtd.o
>>   sfc-$(CONFIG_SFC_SRIOV)	+= sriov.o ef10_sriov.o ef100_sriov.o ef100_rep.o \
>>                              mae.o tc.o tc_bindings.o tc_counters.o \
>> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
>> index e9d9de8e648a..cb3f74d30852 100644
>> --- a/drivers/net/ethernet/sfc/efx.c
>> +++ b/drivers/net/ethernet/sfc/efx.c
>> @@ -33,6 +33,7 @@
>>   #include "selftest.h"
>>   #include "sriov.h"
>>   #include "efx_devlink.h"
>> +#include "efx_cxl.h"
>>   
>>   #include "mcdi_port_common.h"
>>   #include "mcdi_pcol.h"
>> @@ -899,6 +900,7 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>>   	efx_pci_remove_main(efx);
>>   
>>   	efx_fini_io(efx);
>> +
> stray blank line
>
>>   	pci_dbg(efx->pci_dev, "shutdown successful\n");
>>   
>>   	efx_fini_devlink_and_unlock(efx);
>> @@ -1109,6 +1111,8 @@ static int efx_pci_probe(struct pci_dev *pci_dev,
>>   	if (rc)
>>   		goto fail2;
>>   
>> +	efx_cxl_init(efx);
> No error checks? Does the device expect to work whether CXL is setup or not?
>

Right. The netdev functionality will not be jeopardized because CXL 
initialization errors. If it is all fine, the PIO buffers will be mapped 
using the created CXL region, if not, PIO buffers will be used mapping 
at specific BAR offset.


>> +
>>   	rc = efx_pci_probe_post_io(efx);
>>   	if (rc) {
>>   		/* On failure, retry once immediately.
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.c b/drivers/net/ethernet/sfc/efx_cxl.c
>> new file mode 100644
>> index 000000000000..4554dd7cca76
>> --- /dev/null
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.c
>> @@ -0,0 +1,53 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/****************************************************************************
>> + * Driver for AMD network controllers and boards
>> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation, incorporated herein by reference.
>> + */
>> +
>> +
>> +#include <linux/pci.h>
>> +#include <linux/cxl_accel_mem.h>
>> +#include <linux/cxl_accel_pci.h>
>> +
>> +#include "net_driver.h"
>> +#include "efx_cxl.h"
>> +
>> +#define EFX_CTPIO_BUFFER_SIZE	(1024*1024*256)
>> +
>> +void efx_cxl_init(struct efx_nic *efx)
>> +{
>> +	struct pci_dev *pci_dev = efx->pci_dev;
>> +	struct efx_cxl *cxl = efx->cxl;
>> +	struct resource res;
>> +	u16 dvsec;
>> +
>> +	dvsec = pci_find_dvsec_capability(pci_dev, PCI_VENDOR_ID_CXL,
>> +					  CXL_DVSEC_PCIE_DEVICE);
>> +
>> +	if (!dvsec)
>> +		return;
>> +
>> +	pci_info(pci_dev, "CXL CXL_DVSEC_PCIE_DEVICE capability found");
> Seem like unnecessary kern log emission
>

Uhmm, yes, maybe something more linked to how PIO buffer end up being 
used at a later time.

>> +
>> +	cxl->cxlds = cxl_accel_state_create(&pci_dev->dev);
>> +	if (IS_ERR(cxl->cxlds)) {
>> +		pci_info(pci_dev, "CXL accel device state failed");
> pci_err()? or maybe pci_warn() given it's ignoring error returns.


Right. I will change this and other similar ones.


>> +		return;
>> +	}
>> +
>> +	cxl_accel_set_dvsec(cxl->cxlds, dvsec);
>> +	cxl_accel_set_serial(cxl->cxlds, pci_dev->dev.id);
>> +
>> +	res = DEFINE_RES_MEM(0, EFX_CTPIO_BUFFER_SIZE);
>> +	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_DPA);
>> +
>> +	res = DEFINE_RES_MEM_NAMED(0, EFX_CTPIO_BUFFER_SIZE, "ram");
>> +	cxl_accel_set_resource(cxl->cxlds, res, CXL_ACCEL_RES_RAM);
>> +}
>> +
>> +
>> +MODULE_IMPORT_NS(CXL);
>> diff --git a/drivers/net/ethernet/sfc/efx_cxl.h b/drivers/net/ethernet/sfc/efx_cxl.h
>> new file mode 100644
>> index 000000000000..76c6794c20d8
>> --- /dev/null
>> +++ b/drivers/net/ethernet/sfc/efx_cxl.h
>> @@ -0,0 +1,29 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/****************************************************************************
>> + * Driver for AMD network controllers and boards
>> + * Copyright (C) 2024, Advanced Micro Devices, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published
>> + * by the Free Software Foundation, incorporated herein by reference.
>> + */
>> +
>> +#ifndef EFX_CXL_H
>> +#define EFX_CLX_H
>> +
>> +#include <linux/cxl_accel_mem.h>
>> +
>> +struct efx_nic;
>> +
>> +struct efx_cxl {
>> +	cxl_accel_state *cxlds;
>> +	struct cxl_memdev *cxlmd;
>> +	struct cxl_root_decoder *cxlrd;
>> +	struct cxl_port *endpoint;
>> +	struct cxl_endpoint_decoder *cxled;
>> +	struct cxl_region *efx_region;
>> +	void __iomem *ctpio_cxl;
>> +};
>> +
>> +void efx_cxl_init(struct efx_nic *efx);
>> +#endif
>> diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
>> index f2dd7feb0e0c..58b7517afea4 100644
>> --- a/drivers/net/ethernet/sfc/net_driver.h
>> +++ b/drivers/net/ethernet/sfc/net_driver.h
>> @@ -814,6 +814,8 @@ enum efx_xdp_tx_queues_mode {
>>   
>>   struct efx_mae;
>>   
>> +struct efx_cxl;
>> +
>>   /**
>>    * struct efx_nic - an Efx NIC
>>    * @name: Device name (net device name or bus id before net device registered)
>> @@ -962,6 +964,7 @@ struct efx_mae;
>>    * @tc: state for TC offload (EF100).
>>    * @devlink: reference to devlink structure owned by this device
>>    * @dl_port: devlink port associated with the PF
>> + * @cxl: details of related cxl objects
>>    * @mem_bar: The BAR that is mapped into membase.
>>    * @reg_base: Offset from the start of the bar to the function control window.
>>    * @monitor_work: Hardware monitor workitem
>> @@ -1148,6 +1151,7 @@ struct efx_nic {
>>   
>>   	struct devlink *devlink;
>>   	struct devlink_port *dl_port;
>> +	struct efx_cxl *cxl;
>>   	unsigned int mem_bar;
>>   	u32 reg_base;
>>   
>> diff --git a/include/linux/cxl_accel_mem.h b/include/linux/cxl_accel_mem.h
>> new file mode 100644
>> index 000000000000..daf46d41f59c
>> --- /dev/null
>> +++ b/include/linux/cxl_accel_mem.h
>> @@ -0,0 +1,22 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
>> +
>> +#include <linux/cdev.h>
> Don't think this header is needed?
>
>> +
>> +#ifndef __CXL_ACCEL_MEM_H
>> +#define __CXL_ACCEL_MEM_H
>> +
>> +enum accel_resource{
>> +	CXL_ACCEL_RES_DPA,
>> +	CXL_ACCEL_RES_RAM,
>> +	CXL_ACCEL_RES_PMEM,
>> +};
>> +
>> +typedef struct cxl_dev_state cxl_accel_state;
> Please use 'struct cxl_dev_state' directly. There's no good reason to hide the type.


That is what I think I was told to do although not explicitly. There 
were concerns in the RFC about accel drivers too loose for doing things 
regarding CXL and somehow CXL core should keep control as much as 
possible.  I was even thought I was being asked to implement auxbus with 
the CXL part of an accel as an auxiliar device which should be bound to 
a CXL core driver. Then Jonathan Cameron the only one explicitly giving 
the possibility of the opaque approach and disadvising the auxbus idea.


Maybe I need an explicit action here.


>> +cxl_accel_state *cxl_accel_state_create(struct device *dev);
>> +
>> +void cxl_accel_set_dvsec(cxl_accel_state *cxlds, u16 dvsec);
>> +void cxl_accel_set_serial(cxl_accel_state *cxlds, u64 serial);
>> +void cxl_accel_set_resource(struct cxl_dev_state *cxlds, struct resource res,
>> +			    enum accel_resource);
>> +#endif
>> diff --git a/include/linux/cxl_accel_pci.h b/include/linux/cxl_accel_pci.h
>> new file mode 100644
>> index 000000000000..c337ae8797e6
>> --- /dev/null
>> +++ b/include/linux/cxl_accel_pci.h
>> @@ -0,0 +1,23 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/* Copyright(c) 2024 Advanced Micro Devices, Inc. */
>> +
>> +#ifndef __CXL_ACCEL_PCI_H
>> +#define __CXL_ACCEL_PCI_H
>> +
>> +/* CXL 2.0 8.1.3: PCIe DVSEC for CXL Device */
>> +#define CXL_DVSEC_PCIE_DEVICE					0
>> +#define   CXL_DVSEC_CAP_OFFSET		0xA
>> +#define     CXL_DVSEC_MEM_CAPABLE	BIT(2)
>> +#define     CXL_DVSEC_HDM_COUNT_MASK	GENMASK(5, 4)
>> +#define   CXL_DVSEC_CTRL_OFFSET		0xC
>> +#define     CXL_DVSEC_MEM_ENABLE	BIT(2)
>> +#define   CXL_DVSEC_RANGE_SIZE_HIGH(i)	(0x18 + (i * 0x10))
>> +#define   CXL_DVSEC_RANGE_SIZE_LOW(i)	(0x1C + (i * 0x10))
>> +#define     CXL_DVSEC_MEM_INFO_VALID	BIT(0)
>> +#define     CXL_DVSEC_MEM_ACTIVE	BIT(1)
>> +#define     CXL_DVSEC_MEM_SIZE_LOW_MASK	GENMASK(31, 28)
>> +#define   CXL_DVSEC_RANGE_BASE_HIGH(i)	(0x20 + (i * 0x10))
>> +#define   CXL_DVSEC_RANGE_BASE_LOW(i)	(0x24 + (i * 0x10))
>> +#define     CXL_DVSEC_MEM_BASE_LOW_MASK	GENMASK(31, 28)
> This looks like a copy/paste of drivers/cxl/cxlpci.h definition. I suggest create a include/linux/cxl/pci.h and stick it in there and delete the copy in cxlpci.h. Also update the CXL spec version to latest (3.1) if you don't mind if we are going to move it.


That makes sense. I'll do it.

Thanks


>> +
>> +#endif

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ