lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <cfae046e-106d-4963-88be-8ca116859538@intel.com>
Date: Fri, 19 Sep 2025 15:00:29 +0200
From: Przemek Kitszel <przemyslaw.kitszel@...el.com>
To: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
CC: Andrew Lunn <andrew+netdev@...n.ch>, "David S. Miller"
	<davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski
	<kuba@...nel.org>, <netdev@...r.kernel.org>, Paolo Abeni <pabeni@...hat.com>,
	Wen Gu <guwen@...ux.alibaba.com>, Philo Lu <lulie@...ux.alibaba.com>,
	"Lorenzo Bianconi" <lorenzo@...nel.org>, Vadim Fedorenko
	<vadim.fedorenko@...ux.dev>, Lukas Bulwahn <lukas.bulwahn@...hat.com>, Geert
 Uytterhoeven <geert+renesas@...der.be>, Vivian Wang
	<wangruikang@...as.ac.cn>, "Troy Mitchell"
	<troy.mitchell@...ux.spacemit.com>, Dust Li <dust.li@...ux.alibaba.com>
Subject: Re: [PATCH net-next v3] eea: Add basic driver framework for Alibaba
 Elastic Ethernet Adaptor

On 9/19/25 03:48, Xuan Zhuo wrote:
> Add a driver framework for EEA that will be available in the future.
> 
> This driver is currently quite minimal, implementing only fundamental
> core functionalities. Key features include: I/O queue management via
> adminq, basic PCI-layer operations, and essential RX/TX data
> communication capabilities. It also supports the creation,
> initialization, and management of network devices (netdev). Furthermore,
> the ring structures for both I/O queues and adminq have been abstracted
> into a simple, unified, and reusable library implementation,
> facilitating future extension and maintenance.
> 
> This commit is indeed quite large, but further splitting it would not be
> meaningful. Historically, many similar drivers have been introduced with
> commits of similar size and scope, so we chose not to invest excessive
> effort into finer-grained splitting.
> 
> Reviewed-by: Dust Li <dust.li@...ux.alibaba.com>
> Reviewed-by: Philo Lu <lulie@...ux.alibaba.com>
> Signed-off-by: Wen Gu <guwen@...ux.alibaba.com>
> Signed-off-by: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>

if, by any chance, there were more than you developing this
driver, it is the best moment to give some credit via Co-developed-by
tags to major contributors

please find my feedback inline, sorry for it being in random places,
otherwise it would take a whole day
(but please apply fix for each type of issue per whole series)

> ---
> 
> v3: Thanks for the comments from Paolo Abenchi
> v2: Thanks for the comments from Simon Horman and Andrew Lunn
> v1: Thanks for the comments from Simon Horman and Andrew Lunn
> 
>   MAINTAINERS                                   |   8 +
>   drivers/net/ethernet/Kconfig                  |   1 +
>   drivers/net/ethernet/Makefile                 |   1 +
>   drivers/net/ethernet/alibaba/Kconfig          |  29 +
>   drivers/net/ethernet/alibaba/Makefile         |   5 +
>   drivers/net/ethernet/alibaba/eea/Makefile     |   9 +
>   drivers/net/ethernet/alibaba/eea/eea_adminq.c | 452 ++++++++++
>   drivers/net/ethernet/alibaba/eea/eea_adminq.h |  70 ++
>   drivers/net/ethernet/alibaba/eea/eea_desc.h   | 155 ++++
>   .../net/ethernet/alibaba/eea/eea_ethtool.c    | 310 +++++++
>   .../net/ethernet/alibaba/eea/eea_ethtool.h    |  51 ++
>   drivers/net/ethernet/alibaba/eea/eea_net.c    | 575 +++++++++++++
>   drivers/net/ethernet/alibaba/eea/eea_net.h    | 196 +++++
>   drivers/net/ethernet/alibaba/eea/eea_pci.c    | 574 +++++++++++++
>   drivers/net/ethernet/alibaba/eea/eea_pci.h    |  67 ++
>   drivers/net/ethernet/alibaba/eea/eea_ring.c   | 267 ++++++
>   drivers/net/ethernet/alibaba/eea/eea_ring.h   |  89 ++
>   drivers/net/ethernet/alibaba/eea/eea_rx.c     | 784 ++++++++++++++++++
>   drivers/net/ethernet/alibaba/eea/eea_tx.c     | 405 +++++++++
>   19 files changed, 4048 insertions(+)
>   create mode 100644 drivers/net/ethernet/alibaba/Kconfig
>   create mode 100644 drivers/net/ethernet/alibaba/Makefile
>   create mode 100644 drivers/net/ethernet/alibaba/eea/Makefile
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_adminq.c
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_adminq.h
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_desc.h
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ethtool.c
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ethtool.h
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_net.c
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_net.h
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_pci.c
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_pci.h
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ring.c
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_ring.h
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_rx.c
>   create mode 100644 drivers/net/ethernet/alibaba/eea/eea_tx.c
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index a8a770714101..9ffc6a753842 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -789,6 +789,14 @@ S:	Maintained
>   F:	Documentation/i2c/busses/i2c-ali1563.rst
>   F:	drivers/i2c/busses/i2c-ali1563.c
> 
> +ALIBABA ELASTIC ETHERNET ADAPTOR DRIVER
> +M:	Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
> +M:	Wen Gu <guwen@...ux.alibaba.com>
> +R:	Philo Lu <lulie@...ux.alibaba.com>
> +L:	netdev@...r.kernel.org
> +S:	Supported

this is reserved for companies that run netdev-ci tests on their HW

> +F:	drivers/net/ethernet/alibaba/eea
> +
>   ALIBABA ELASTIC RDMA DRIVER
>   M:	Cheng Xu <chengyou@...ux.alibaba.com>
>   M:	Kai Shen <kaishen@...ux.alibaba.com>
> diff --git a/drivers/net/ethernet/Kconfig b/drivers/net/ethernet/Kconfig
> index aead145dd91d..307c68a4fd53 100644
> --- a/drivers/net/ethernet/Kconfig
> +++ b/drivers/net/ethernet/Kconfig
> @@ -22,6 +22,7 @@ source "drivers/net/ethernet/aeroflex/Kconfig"
>   source "drivers/net/ethernet/agere/Kconfig"
>   source "drivers/net/ethernet/airoha/Kconfig"
>   source "drivers/net/ethernet/alacritech/Kconfig"
> +source "drivers/net/ethernet/alibaba/Kconfig"
>   source "drivers/net/ethernet/allwinner/Kconfig"
>   source "drivers/net/ethernet/alteon/Kconfig"
>   source "drivers/net/ethernet/altera/Kconfig"
> diff --git a/drivers/net/ethernet/Makefile b/drivers/net/ethernet/Makefile
> index 998dd628b202..358d88613cf4 100644
> --- a/drivers/net/ethernet/Makefile
> +++ b/drivers/net/ethernet/Makefile
> @@ -12,6 +12,7 @@ obj-$(CONFIG_NET_VENDOR_ADI) += adi/
>   obj-$(CONFIG_NET_VENDOR_AGERE) += agere/
>   obj-$(CONFIG_NET_VENDOR_AIROHA) += airoha/
>   obj-$(CONFIG_NET_VENDOR_ALACRITECH) += alacritech/
> +obj-$(CONFIG_NET_VENDOR_ALIBABA) += alibaba/
>   obj-$(CONFIG_NET_VENDOR_ALLWINNER) += allwinner/
>   obj-$(CONFIG_NET_VENDOR_ALTEON) += alteon/
>   obj-$(CONFIG_ALTERA_TSE) += altera/
> diff --git a/drivers/net/ethernet/alibaba/Kconfig b/drivers/net/ethernet/alibaba/Kconfig
> new file mode 100644
> index 000000000000..4040666ce129
> --- /dev/null
> +++ b/drivers/net/ethernet/alibaba/Kconfig
> @@ -0,0 +1,29 @@
> +#
> +# Alibaba network device configuration
> +#
> +
> +config NET_VENDOR_ALIBABA
> +	bool "Alibaba Devices"
> +	default y
> +	help
> +	  If you have a network (Ethernet) device belonging to this class, say Y.
> +
> +	  Note that the answer to this question doesn't directly affect the

I would say it does (at least with your proposed `default m` below)

> +	  kernel: saying N will just cause the configurator to skip all
> +	  the questions about Alibaba devices. If you say Y, you will be asked
> +	  for your specific device in the following questions.
> +
> +if NET_VENDOR_ALIBABA
> +
> +config EEA
> +	tristate "Alibaba Elastic Ethernet Adaptor support"
> +	depends on PCI_MSI
> +	depends on 64BIT
> +	select PAGE_POOL
> +	default m
> +	help
> +	  This driver supports Alibaba Elastic Ethernet Adaptor"
> +
> +	  To compile this driver as a module, choose M here.
> +
> +endif #NET_VENDOR_ALIBABA
> diff --git a/drivers/net/ethernet/alibaba/Makefile b/drivers/net/ethernet/alibaba/Makefile
> new file mode 100644
> index 000000000000..7980525cb086
> --- /dev/null
> +++ b/drivers/net/ethernet/alibaba/Makefile
> @@ -0,0 +1,5 @@
> +#
> +# Makefile for the Alibaba network device drivers.
> +#
> +
> +obj-$(CONFIG_EEA) += eea/
> diff --git a/drivers/net/ethernet/alibaba/eea/Makefile b/drivers/net/ethernet/alibaba/eea/Makefile
> new file mode 100644
> index 000000000000..bf2dad05e09a
> --- /dev/null
> +++ b/drivers/net/ethernet/alibaba/eea/Makefile
> @@ -0,0 +1,9 @@
> +
> +obj-$(CONFIG_EEA) += eea.o
> +eea-objs := eea_ring.o \

eea-y := ...
as *-objs suffix is reserved rather for (user-space) host programs while
     usually *-y suffix is used for kernel drivers (although *-objs works
     for that purpose for now).

would be also good to start with this (and other, like #includes) lists
sorted

> +	eea_net.o \
> +	eea_pci.o \
> +	eea_adminq.o \
> +	eea_ethtool.o \
> +	eea_tx.o \
> +	eea_rx.o
> diff --git a/drivers/net/ethernet/alibaba/eea/eea_adminq.c b/drivers/net/ethernet/alibaba/eea/eea_adminq.c
> new file mode 100644
> index 000000000000..625dd27bfb5d
> --- /dev/null
> +++ b/drivers/net/ethernet/alibaba/eea/eea_adminq.c
> @@ -0,0 +1,452 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Driver for Alibaba Elastic Ethernet Adaptor.
> + *
> + * Copyright (C) 2025 Alibaba Inc.
> + */
> +
> +#include <linux/etherdevice.h>
> +#include <linux/utsname.h>
> +#include <linux/iopoll.h>
> +#include <linux/version.h>

would be great to sort includes
BTW, do you need version.h? (IWYU, but not more)

> +
> +#include "eea_net.h"
> +#include "eea_ring.h"
> +#include "eea_adminq.h"
> +#include "eea_pci.h"
> +
> +#define EEA_AQ_CMD_CFG_QUERY         ((0 << 8) | 0)
> +
> +#define EEA_AQ_CMD_QUEUE_CREATE      ((1 << 8) | 0)
> +#define EEA_AQ_CMD_QUEUE_DESTROY_ALL ((1 << 8) | 1)
> +#define EEA_AQ_CMD_QUEUE_DESTROY_Q   ((1 << 8) | 2)
> +
> +#define EEA_AQ_CMD_HOST_INFO         ((2 << 8) | 0)
> +
> +#define EEA_AQ_CMD_DEV_STATUS        ((3 << 8) | 0)
> +
> +#define ERING_DESC_F_AQ_PHASE	     (BIT(15) | BIT(7))

EEA_ prefix

> +
> +struct eea_aq_create {
> +#define EEA_QUEUE_FLAGS_HW_SPLIT_HDR BIT(0)
> +#define EEA_QUEUE_FLAGS_SQCQ         BIT(1)
> +#define EEA_QUEUE_FLAGS_HWTS         BIT(2)

move #defines out ouf struct definitions

> +	__le32 flags;
> +	/* queue index.
> +	 * rx: 0 == qidx % 2
> +	 * tx: 1 == qidx % 2
> +	 */
> +	__le16 qidx;
> +	/* the depth of the queue */
> +	__le16 depth;
> +	/*  0: without SPLIT HDR
> +	 *  1: 128B
> +	 *  2: 256B
> +	 *  3: 512B
> +	 */
> +	u8 hdr_buf_size;
> +	u8 sq_desc_size;
> +	u8 cq_desc_size;
> +	u8 reserve0;
> +	/* The vector for the irq. rx,tx share the same vector */
> +	__le16 msix_vector;
> +	__le16 reserve;
> +	/* sq ring cfg. */
> +	__le32 sq_addr_low;
> +	__le32 sq_addr_high;
> +	/* cq ring cfg. Just valid when flags include EEA_QUEUE_FLAGS_SQCQ. */
> +	__le32 cq_addr_low;
> +	__le32 cq_addr_high;
> +};
> +
> +struct aq_queue_drv_status {

add eea_ prefix to all your structs, fuctions, and defines

> +	__le16 qidx;
> +
> +	__le16 sq_head;
> +	__le16 cq_head;
> +	__le16 reserved;
> +};
> +
> +struct eea_aq_host_info_cfg {
> +#define EEA_OS_DISTRO		0
> +#define EEA_DRV_TYPE		0
> +#define EEA_OS_LINUX		1
> +#define EEA_SPEC_VER_MAJOR	1
> +#define EEA_SPEC_VER_MINOR	0
> +	__le16	os_type;        /* Linux, Win.. */

really?

> +	__le16	os_dist;
> +	__le16	drv_type;
> +
> +	__le16	kern_ver_major;
> +	__le16	kern_ver_minor;
> +	__le16	kern_ver_sub_minor;
> +
> +	__le16	drv_ver_major;
> +	__le16	drv_ver_minor;
> +	__le16	drv_ver_sub_minor;
> +
> +	__le16	spec_ver_major;
> +	__le16	spec_ver_minor;
> +	__le16	pci_bdf;
> +	__le32	pci_domain;
> +
> +	u8      os_ver_str[64];
> +	u8      isa_str[64];
> +};
> +
> +struct eea_aq_host_info_rep {
> +#define EEA_HINFO_MAX_REP_LEN	1024
> +#define EEA_HINFO_REP_PASS	1
> +#define EEA_HINFO_REP_REJECT	2
> +	u8	op_code;
> +	u8	has_reply;
> +	u8	reply_str[EEA_HINFO_MAX_REP_LEN];
> +};
> +
> +static struct ering *qid_to_ering(struct eea_net *enet, u32 qid)
> +{
> +	struct ering *ering;
> +
> +	if (qid % 2 == 0)
> +		ering = enet->rx[qid / 2]->ering;
> +	else
> +		ering = enet->tx[qid / 2].ering;
> +
> +	return ering;
> +}
> +
> +#define EEA_AQ_TIMEOUT_US (60 * 1000 * 1000)
> +
> +static int eea_adminq_submit(struct eea_net *enet, u16 cmd,
> +			     dma_addr_t req_addr, dma_addr_t res_addr,
> +			     u32 req_size, u32 res_size)
> +{
> +	struct eea_aq_cdesc *cdesc;
> +	struct eea_aq_desc *desc;
> +	int ret;
> +
> +	desc = ering_aq_alloc_desc(enet->adminq.ring);
> +
> +	desc->classid = cmd >> 8;
> +	desc->command = cmd & 0xff;
> +
> +	desc->data_addr = cpu_to_le64(req_addr);
> +	desc->data_len = cpu_to_le32(req_size);
> +
> +	desc->reply_addr = cpu_to_le64(res_addr);
> +	desc->reply_len = cpu_to_le32(res_size);
> +
> +	/* for update flags */
> +	wmb();
> +
> +	desc->flags = cpu_to_le16(enet->adminq.phase);
> +
> +	ering_sq_commit_desc(enet->adminq.ring);
> +
> +	ering_kick(enet->adminq.ring);
> +
> +	++enet->adminq.num;
> +
> +	if ((enet->adminq.num % enet->adminq.ring->num) == 0)
> +		enet->adminq.phase ^= ERING_DESC_F_AQ_PHASE;
> +
> +	ret = read_poll_timeout(ering_cq_get_desc, cdesc, cdesc, 0,
> +				EEA_AQ_TIMEOUT_US, false, enet->adminq.ring);
> +	if (ret)
> +		return ret;
> +
> +	ret = le32_to_cpu(cdesc->status);
> +
> +	ering_cq_ack_desc(enet->adminq.ring, 1);
> +
> +	if (ret)
> +		netdev_err(enet->netdev,
> +			   "adminq exec failed. cmd: %d ret %d\n", cmd, ret);
> +
> +	return ret;
> +}
> +
> +static int eea_adminq_exec(struct eea_net *enet, u16 cmd,
> +			   void *req, u32 req_size, void *res, u32 res_size)
> +{
> +	dma_addr_t req_addr, res_addr;
> +	struct device *dma;
> +	int ret;
> +
> +	dma = enet->edev->dma_dev;
> +
> +	req_addr = 0;
> +	res_addr = 0;
> +
> +	if (req) {
> +		req_addr = dma_map_single(dma, req, req_size, DMA_TO_DEVICE);

no idea how much adminq you need to send, but it feels wrong to do dma
mapping each time

> +		if (unlikely(dma_mapping_error(dma, req_addr)))
> +			return -ENOMEM;
> +	}
> +
> +	if (res) {
> +		res_addr = dma_map_single(dma, res, res_size, DMA_FROM_DEVICE);
> +		if (unlikely(dma_mapping_error(dma, res_addr))) {
> +			ret = -ENOMEM;
> +			goto err_map_res;
> +		}
> +	}
> +
> +	ret = eea_adminq_submit(enet, cmd, req_addr, res_addr,
> +				req_size, res_size);
> +
> +	if (res)
> +		dma_unmap_single(dma, res_addr, res_size, DMA_FROM_DEVICE);
> +
> +err_map_res:
> +	if (req)
> +		dma_unmap_single(dma, req_addr, req_size, DMA_TO_DEVICE);
> +
> +	return ret;
> +}
> +
> +void eea_destroy_adminq(struct eea_net *enet)
> +{
> +	/* Unactive adminq by device reset. So the device reset should be called

s/Unactive/Deactivate/
but this comment does not belong here, rather to the caller/s

> +	 * before this.
> +	 */
> +	if (enet->adminq.ring) {
> +		ering_free(enet->adminq.ring);
> +		enet->adminq.ring = NULL;
> +		enet->adminq.phase = 0;
> +	}
> +}
> +
> +int eea_create_adminq(struct eea_net *enet, u32 qid)
> +{
> +	struct ering *ering;
> +	int err;
> +
> +	ering = ering_alloc(qid, 64, enet->edev, sizeof(struct eea_aq_desc),
> +			    sizeof(struct eea_aq_desc), "adminq");
> +	if (!ering)
> +		return -ENOMEM;
> +
> +	err = eea_pci_active_aq(ering);
> +	if (err) {
> +		ering_free(ering);
> +		return -EBUSY;
> +	}
> +
> +	enet->adminq.ring = ering;
> +	enet->adminq.phase = BIT(7);
> +	enet->adminq.num = 0;
> +
> +	/* set device ready to active adminq */
> +	eea_device_ready(enet->edev);
> +
> +	return 0;
> +}
> +
> +int eea_adminq_query_cfg(struct eea_net *enet, struct eea_aq_cfg *cfg)
> +{
> +	return eea_adminq_exec(enet, EEA_AQ_CMD_CFG_QUERY, NULL, 0, cfg,
> +			       sizeof(*cfg));
> +}
> +
> +static void qcfg_fill(struct eea_aq_create *qcfg, struct ering *ering,
> +		      u32 flags)
> +{
> +	qcfg->flags = cpu_to_le32(flags);
> +	qcfg->qidx = cpu_to_le16(ering->index);
> +	qcfg->depth = cpu_to_le16(ering->num);
> +
> +	qcfg->hdr_buf_size = flags & EEA_QUEUE_FLAGS_HW_SPLIT_HDR ? 1 : 0;

x = !!y instead of x = y ? 1 : 0
(when needed to map to 0/1)

> +	qcfg->sq_desc_size = ering->sq.desc_size;
> +	qcfg->cq_desc_size = ering->cq.desc_size;
> +	qcfg->msix_vector = cpu_to_le16(ering->msix_vec);
> +
> +	qcfg->sq_addr_low = cpu_to_le32(ering->sq.dma_addr);
> +	qcfg->sq_addr_high = cpu_to_le32(ering->sq.dma_addr >> 32);
> +
> +	qcfg->cq_addr_low = cpu_to_le32(ering->cq.dma_addr);
> +	qcfg->cq_addr_high = cpu_to_le32(ering->cq.dma_addr >> 32);
> +}
> +
> +int eea_adminq_create_q(struct eea_net *enet, u32 qidx, u32 num, u32 flags)
> +{
> +	struct device *dev = enet->edev->dma_dev;
> +	int i, err, db_size, q_size, qid;
> +	struct eea_aq_create *q_buf;
> +	dma_addr_t db_dma, q_dma;
> +	struct eea_net_cfg *cfg;
> +	struct ering *ering;
> +	__le32 *db_buf;
> +
> +	err = -ENOMEM;
> +
> +	cfg = &enet->cfg;
> +
> +	if (cfg->split_hdr)
> +		flags |= EEA_QUEUE_FLAGS_HW_SPLIT_HDR;
> +
> +	flags |= EEA_QUEUE_FLAGS_SQCQ;
> +	flags |= EEA_QUEUE_FLAGS_HWTS;
> +
> +	db_size = sizeof(int) * num;
> +	q_size = sizeof(struct eea_aq_create) * num;

would be best to use struct_size()

> +
> +	db_buf = dma_alloc_coherent(dev, db_size, &db_dma, GFP_KERNEL);
> +	if (!db_buf)
> +		return err;
> +
> +	q_buf = dma_alloc_coherent(dev, q_size, &q_dma, GFP_KERNEL);
> +	if (!q_buf)
> +		goto err_db;
> +
> +	qid = qidx;
> +	for (i = 0; i < num; i++, qid++)
> +		qcfg_fill(q_buf + i, qid_to_ering(enet, qid), flags);
> +
> +	err = eea_adminq_exec(enet, EEA_AQ_CMD_QUEUE_CREATE,
> +			      q_buf, q_size, db_buf, db_size);
> +	if (err)
> +		goto err;
> +	err = 0;

this assignement makes no sense (please check if there is something
missing in your source)

> +
> +	qid = qidx;
> +	for (i = 0; i < num; i++, qid++) {
> +		ering = qid_to_ering(enet, qid);
> +		ering->db = eea_pci_db_addr(ering->edev,
> +					    le32_to_cpu(db_buf[i]));
> +	}
> +
> +err:
> +	dma_free_coherent(dev, q_size, q_buf, q_dma);
> +err_db:
> +	dma_free_coherent(dev, db_size, db_buf, db_dma);
> +	return err;
> +}
> +
> +int eea_adminq_destroy_q(struct eea_net *enet, u32 qidx, int num)
> +{
> +	struct device *dev = enet->edev->dma_dev;
> +	dma_addr_t dma_addr;
> +	__le16 *buf;
> +	u32 size;
> +	int i;
> +
> +	if (qidx == 0 && num == -1)
> +		return eea_adminq_exec(enet, EEA_AQ_CMD_QUEUE_DESTROY_ALL,
> +				       NULL, 0, NULL, 0);
> +
> +	size = sizeof(__le16) * num;
> +	buf = dma_alloc_coherent(dev, size, &dma_addr, GFP_KERNEL);
> +	if (!buf)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < num; ++i)
> +		buf[i] = cpu_to_le16(qidx++);
> +
> +	return eea_adminq_exec(enet, EEA_AQ_CMD_QUEUE_DESTROY_Q,
> +			       buf, size, NULL, 0);
> +}
> +
> +struct aq_dev_status *eea_adminq_dev_status(struct eea_net *enet)
> +{
> +	struct aq_queue_drv_status *drv_status;
> +	struct aq_dev_status *dev_status;
> +	int err, i, num, size;
> +	struct ering *ering;
> +	void *rep, *req;
> +
> +	num = enet->cfg.tx_ring_num * 2 + 1;
> +
> +	req = kcalloc(num, sizeof(struct aq_queue_drv_status), GFP_KERNEL);

sizeof(*ptr) instead of sizeof(type) when possible

> +	if (!req)
> +		return NULL;
> +
> +	size = struct_size(dev_status, q_status, num);
> +
> +	rep = kmalloc(size, GFP_KERNEL);
> +	if (!rep) {
> +		kfree(req);
> +		return NULL;
> +	}
> +
> +	drv_status = req;
> +	for (i = 0; i < enet->cfg.rx_ring_num * 2; ++i, ++drv_status) {
> +		ering = qid_to_ering(enet, i);
> +		drv_status->qidx = cpu_to_le16(i);
> +		drv_status->cq_head = cpu_to_le16(ering->cq.head);
> +		drv_status->sq_head = cpu_to_le16(ering->sq.head);
> +	}
> +
> +	drv_status->qidx = cpu_to_le16(i);
> +	drv_status->cq_head = cpu_to_le16(enet->adminq.ring->cq.head);
> +	drv_status->sq_head = cpu_to_le16(enet->adminq.ring->sq.head);
> +
> +	err = eea_adminq_exec(enet, EEA_AQ_CMD_DEV_STATUS,
> +			      req, num * sizeof(struct aq_queue_drv_status),
> +			      rep, size);
> +	kfree(req);
> +	if (err) {
> +		kfree(rep);
> +		return NULL;
> +	}
> +
> +	return rep;
> +}
> +
> +int eea_adminq_config_host_info(struct eea_net *enet)
> +{
> +	struct device *dev = enet->edev->dma_dev;
> +	struct eea_aq_host_info_cfg *cfg;
> +	struct eea_aq_host_info_rep *rep;
> +	int rc = -ENOMEM;
> +
> +	cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
> +	if (!cfg)
> +		return rc;
> +
> +	rep = kzalloc(sizeof(*rep), GFP_KERNEL);
> +	if (!rep)
> +		goto free_cfg;
> +
> +	cfg->os_type            = cpu_to_le16(EEA_OS_LINUX);
> +	cfg->os_dist            = cpu_to_le16(EEA_OS_DISTRO);
> +	cfg->drv_type           = cpu_to_le16(EEA_DRV_TYPE);
> +
> +	cfg->kern_ver_major     = cpu_to_le16(LINUX_VERSION_MAJOR);
> +	cfg->kern_ver_minor     = cpu_to_le16(LINUX_VERSION_PATCHLEVEL);
> +	cfg->kern_ver_sub_minor = cpu_to_le16(LINUX_VERSION_SUBLEVEL);
> +
> +	cfg->drv_ver_major      = cpu_to_le16(EEA_VER_MAJOR);
> +	cfg->drv_ver_minor      = cpu_to_le16(EEA_VER_MINOR);
> +	cfg->drv_ver_sub_minor  = cpu_to_le16(EEA_VER_SUB_MINOR);
> +
> +	cfg->spec_ver_major     = cpu_to_le16(EEA_SPEC_VER_MAJOR);
> +	cfg->spec_ver_minor     = cpu_to_le16(EEA_SPEC_VER_MINOR);
> +
> +	cfg->pci_bdf            = cpu_to_le16(eea_pci_dev_id(enet->edev));
> +	cfg->pci_domain         = cpu_to_le32(eea_pci_domain_nr(enet->edev));
> +
> +	strscpy(cfg->os_ver_str, utsname()->release, sizeof(cfg->os_ver_str));
> +	strscpy(cfg->isa_str, utsname()->machine, sizeof(cfg->isa_str));

what is the benefit of this?

> +
> +	rc = eea_adminq_exec(enet, EEA_AQ_CMD_HOST_INFO,
> +			     cfg, sizeof(*cfg), rep, sizeof(*rep));
> +
> +	if (!rc) {
> +		if (rep->op_code == EEA_HINFO_REP_REJECT) {
> +			dev_err(dev, "Device has refused the initialization "
> +				"due to provided host information\n");

do not break strings that users are supposed to grep for

> +			rc = -ENODEV;
> +		}
> +		if (rep->has_reply) {
> +			rep->reply_str[EEA_HINFO_MAX_REP_LEN - 1] = '\0';
> +			dev_warn(dev, "Device replied in host_info config: %s",
> +				 rep->reply_str);
> +		}
> +	}
> +
> +	kfree(rep);
> +free_cfg:
> +	kfree(cfg);
> +	return rc;
> +}



> +static void eea_stats_fill_for_q(struct u64_stats_sync *syncp, u32 num,
> +				 const struct eea_stat_desc *desc,
> +				 u64 *data, u32 idx)
> +{
> +	void *stats_base = (void *)syncp;

needles cast

> +	u32 start, i;
> +
> +	do {
> +		start = u64_stats_fetch_begin(syncp);
> +		for (i = 0; i < num; i++)
> +			data[idx + i] =
> +				u64_stats_read(stats_base + desc[i].offset);

nice trick

> +
> +	} while (u64_stats_fetch_retry(syncp, start));
> +}
> +



> +static int eea_netdev_init_features(struct net_device *netdev,
> +				    struct eea_net *enet,
> +				    struct eea_device *edev)
> +{
> +	struct eea_aq_cfg *cfg __free(kfree) = NULL;
> +	int err;
> +	u32 mtu;
> +
> +	cfg = kmalloc(sizeof(*cfg), GFP_KERNEL);
> +
> +	err = eea_adminq_query_cfg(enet, cfg);
> +

no blank line between value assignement and error check

> +	if (err)
> +		return err;
> +
> +	eea_update_cfg(enet, edev, cfg);
> +
> +	netdev->priv_flags |= IFF_UNICAST_FLT;
> +	netdev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
> +
> +	netdev->hw_features |= NETIF_F_HW_CSUM;
> +	netdev->hw_features |= NETIF_F_GRO_HW;
> +	netdev->hw_features |= NETIF_F_SG;
> +	netdev->hw_features |= NETIF_F_TSO;
> +	netdev->hw_features |= NETIF_F_TSO_ECN;
> +	netdev->hw_features |= NETIF_F_TSO6;
> +	netdev->hw_features |= NETIF_F_GSO_UDP_L4;
> +
> +	netdev->features |= NETIF_F_HIGHDMA;
> +	netdev->features |= NETIF_F_HW_CSUM;
> +	netdev->features |= NETIF_F_SG;
> +	netdev->features |= NETIF_F_GSO_ROBUST;
> +	netdev->features |= netdev->hw_features & NETIF_F_ALL_TSO;
> +	netdev->features |= NETIF_F_RXCSUM;
> +	netdev->features |= NETIF_F_GRO_HW;
> +
> +	netdev->vlan_features = netdev->features;
> +
> +	eth_hw_addr_set(netdev, cfg->mac);
> +
> +	enet->speed = SPEED_UNKNOWN;
> +	enet->duplex = DUPLEX_UNKNOWN;
> +
> +	netdev->min_mtu = ETH_MIN_MTU;
> +
> +	mtu = le16_to_cpu(cfg->mtu);
> +	if (mtu < netdev->min_mtu) {
> +		dev_err(edev->dma_dev, "The device gave us an invalid MTU. "
> +			"Here we can only exit the initialization. %d < %d",
> +			mtu, netdev->min_mtu);
> +		return -EINVAL;
> +	}
> +
> +	netdev->mtu = mtu;
> +
> +	/* If jumbo frames are already enabled, then the returned MTU will be a
> +	 * jumbo MTU, and the driver will automatically enable jumbo frame
> +	 * support by default.
> +	 */
> +	netdev->max_mtu = mtu;
> +
> +	netif_carrier_on(netdev);
> +
> +	return 0;
> +}
> +
> +static const struct net_device_ops eea_netdev = {
> +	.ndo_open           = eea_netdev_open,
> +	.ndo_stop           = eea_netdev_stop,
> +	.ndo_start_xmit     = eea_tx_xmit,
> +	.ndo_validate_addr  = eth_validate_addr,
> +	.ndo_get_stats64    = eea_stats,
> +	.ndo_features_check = passthru_features_check,
> +	.ndo_tx_timeout     = eea_tx_timeout,
> +};
> +
> +static struct eea_net *eea_netdev_alloc(struct eea_device *edev, u32 pairs)
> +{
> +	struct net_device *netdev;
> +	struct eea_net *enet;
> +
> +	netdev = alloc_etherdev_mq(sizeof(struct eea_net), pairs);
> +	if (!netdev) {
> +		dev_err(edev->dma_dev,
> +			"alloc_etherdev_mq failed with pairs %d\n", pairs);
> +		return NULL;
> +	}
> +
> +	netdev->netdev_ops = &eea_netdev;
> +	netdev->ethtool_ops = &eea_ethtool_ops;
> +	SET_NETDEV_DEV(netdev, edev->dma_dev);
> +
> +	enet = netdev_priv(netdev);
> +	enet->netdev = netdev;
> +	enet->edev = edev;
> +	edev->enet = enet;
> +
> +	return enet;
> +}
> +
> +static void eea_update_ts_off(struct eea_device *edev, struct eea_net *enet)
> +{
> +	u64 ts;
> +
> +	ts = eea_pci_device_ts(edev);
> +
> +	enet->hw_ts_offset = ktime_get_real() - ts;
> +}
> +
> +static int eea_net_reprobe(struct eea_device *edev)
> +{
> +	struct eea_net *enet = edev->enet;
> +	int err = 0;
> +
> +	enet->edev = edev;
> +
> +	if (!enet->adminq.ring) {
> +		err = eea_create_adminq(enet, edev->rx_num + edev->tx_num);
> +		if (err)
> +			return err;
> +	}
> +
> +	eea_update_ts_off(edev, enet);
> +
> +	if (edev->ha_reset_netdev_running) {
> +		rtnl_lock();
> +		enet->link_err = 0;
> +		err = eea_netdev_open(enet->netdev);
> +		rtnl_unlock();
> +	}
> +
> +	return err;
> +}
> +
> +int eea_net_probe(struct eea_device *edev)
> +{
> +	struct eea_net *enet;
> +	int err = -ENOMEM;
> +
> +	if (edev->ha_reset)
> +		return eea_net_reprobe(edev);
> +
> +	enet = eea_netdev_alloc(edev, edev->rx_num);
> +	if (!enet)
> +		return -ENOMEM;
> +
> +	err = eea_create_adminq(enet, edev->rx_num + edev->tx_num);
> +	if (err)
> +		goto err_adminq;

would be best to name goto labels after what they do instead of
from-where-you-jump-there

> +
> +	err = eea_adminq_config_host_info(enet);
> +	if (err)
> +		goto err_hinfo;
> +
> +	err = eea_netdev_init_features(enet->netdev, enet, edev);
> +	if (err)
> +		goto err_feature;
> +
> +	err = register_netdev(enet->netdev);
> +	if (err)
> +		goto err_ready;
> +
> +	eea_update_ts_off(edev, enet);
> +	netif_carrier_off(enet->netdev);
> +
> +	netdev_dbg(enet->netdev, "eea probe success.\n");
> +
> +	return 0;
> +
> +err_ready:
> +err_feature:
> +err_hinfo:
> +	eea_device_reset(edev);
> +	eea_destroy_adminq(enet);
> +
> +err_adminq:
> +	free_netdev(enet->netdev);
> +	return err;
> +}
> +
> +void eea_net_remove(struct eea_device *edev)
> +{
> +	struct net_device *netdev;
> +	struct eea_net *enet;
> +
> +	enet = edev->enet;
> +	netdev = enet->netdev;
> +
> +	if (edev->ha_reset) {
> +		edev->ha_reset_netdev_running = false;
> +		if (netif_running(enet->netdev)) {
> +			rtnl_lock();
> +			eea_netdev_stop(enet->netdev);
> +			enet->link_err = EEA_LINK_ERR_HA_RESET_DEV;
> +			enet->edev = NULL;
> +			rtnl_unlock();
> +			edev->ha_reset_netdev_running = true;
> +		}
> +	} else {
> +		unregister_netdev(netdev);
> +		netdev_dbg(enet->netdev, "eea removed.\n");
> +	}
> +
> +	eea_device_reset(edev);
> +
> +	/* free adminq */

please remove comments that say nothing more that code around

> +	eea_destroy_adminq(enet);
> +
> +	if (!edev->ha_reset)
> +		free_netdev(netdev);
> +}
> diff --git a/drivers/net/ethernet/alibaba/eea/eea_net.h b/drivers/net/ethernet/alibaba/eea/eea_net.h
> new file mode 100644
> index 000000000000..74e6a76c1f7f
> --- /dev/null
> +++ b/drivers/net/ethernet/alibaba/eea/eea_net.h
> @@ -0,0 +1,196 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + * Driver for Alibaba Elastic Ethernet Adaptor.
> + *
> + * Copyright (C) 2025 Alibaba Inc.
> + */
> +
> +#ifndef __EEA_NET_H__
> +#define __EEA_NET_H__
> +
> +#include <linux/netdevice.h>
> +#include <linux/ethtool.h>
> +
> +#include "eea_ring.h"
> +#include "eea_ethtool.h"
> +#include "eea_adminq.h"
> +
> +#define EEA_VER_MAJOR		1
> +#define EEA_VER_MINOR		0
> +#define EEA_VER_SUB_MINOR	0
> +
> +struct eea_tx_meta;
> +
> +struct eea_reprobe {

would be best to name structs as nouns, and reserve verbs for functions

> +	struct eea_net *enet;
> +	bool running_before_reprobe;
> +};
> +
> +struct enet_tx {
> +	struct eea_net *enet;
> +
> +	struct ering *ering;
> +
> +	struct eea_tx_meta *meta;
> +	struct eea_tx_meta *free;
> +
> +	struct device *dma_dev;
> +
> +	u32 index;
> +
> +	char name[16];
> +
> +	struct eea_tx_stats stats;
> +};
> +
> +struct eea_rx_meta {
> +	struct eea_rx_meta *next;
> +
> +	struct page *page;
> +	dma_addr_t dma;
> +	u32 offset;
> +	u32 frags;
> +
> +	struct page *hdr_page;
> +	void *hdr_addr;
> +	dma_addr_t hdr_dma;
> +
> +	u32 id;
> +
> +	u32 truesize;
> +	u32 headroom;
> +	u32 tailroom;
> +	u32 room;
> +
> +	u32 len;
> +};
> +
> +struct enet_rx_pkt_ctx {
> +	u16 idx;
> +
> +	bool data_valid;
> +	bool do_drop;
> +
> +	struct sk_buff *head_skb;
> +	struct sk_buff *curr_skb;
> +};
> +
> +struct enet_rx {
> +	struct eea_net *enet;
> +
> +	struct ering *ering;
> +
> +	struct eea_rx_meta *meta;
> +	struct eea_rx_meta *free;
> +
> +	struct device *dma_dev;
> +
> +	u32 index;
> +
> +	u32 flags;
> +
> +	u32 headroom;
> +
> +	struct napi_struct napi;
> +
> +	struct eea_rx_stats stats;
> +
> +	u16 irq_n;
> +
> +	char name[16];

IFNAMSIZ?

> +
> +	struct enet_rx_pkt_ctx pkt;
> +
> +	struct page_pool *pp;
> +};
> +
> +struct eea_net_cfg {
> +	u32 rx_ring_depth;
> +	u32 tx_ring_depth;
> +	u32 rx_ring_num;
> +	u32 tx_ring_num;
> +
> +	u8 rx_sq_desc_size;
> +	u8 rx_cq_desc_size;
> +	u8 tx_sq_desc_size;
> +	u8 tx_cq_desc_size;
> +
> +	u32 split_hdr;
> +
> +	struct hwtstamp_config ts_cfg;
> +};
> +
> +struct eea_net_tmp {
> +	struct eea_net_cfg   cfg;
> +
> +	struct enet_tx      *tx;
> +	struct enet_rx     **rx;
> +
> +	struct net_device   *netdev;
> +	struct eea_device   *edev;
> +};
> +
> +enum {
> +	EEA_LINK_ERR_NONE,
> +	EEA_LINK_ERR_HA_RESET_DEV,
> +	EEA_LINK_ERR_LINK_DOWN,
> +};
> +
> +struct eea_net {
> +	struct eea_device *edev;
> +	struct net_device *netdev;
> +
> +	struct eea_aq adminq;
> +
> +	struct enet_tx *tx;
> +	struct enet_rx **rx;
> +
> +	struct eea_net_cfg cfg;
> +	struct eea_net_cfg cfg_hw;
> +
> +	u32 link_err;
> +
> +	bool started;
> +	bool cpu_aff_set;
> +
> +	u8 duplex;
> +	u32 speed;
> +
> +	u64 hw_ts_offset;
> +};
> +
> +int eea_tx_resize(struct eea_net *enet, struct enet_tx *tx, u32 ring_num);
> +
> +int eea_net_probe(struct eea_device *edev);
> +void eea_net_remove(struct eea_device *edev);
> +int eea_net_freeze(struct eea_device *edev);
> +int eea_net_restore(struct eea_device *edev);
> +
> +int eea_reset_hw_resources(struct eea_net *enet, struct eea_net_tmp *tmp);
> +void enet_mk_tmp_cfg(struct eea_net *enet, struct eea_net_tmp *tmp);
> +int eea_queues_check_and_reset(struct eea_device *edev);
> +
> +/* rx apis */
> +int eea_poll(struct napi_struct *napi, int budget);
> +
> +void enet_rx_stop(struct enet_rx *rx);
> +int enet_rx_start(struct enet_rx *rx);
> +
> +void eea_free_rx(struct enet_rx *rx);
> +struct enet_rx *eea_alloc_rx(struct eea_net_tmp *tmp, u32 idx);
> +
> +int eea_irq_free(struct enet_rx *rx);
> +
> +int enet_rxtx_irq_setup(struct eea_net *enet, u32 qid, u32 num);
> +
> +/* tx apis */
> +int eea_poll_tx(struct enet_tx *tx, int budget);
> +void eea_poll_cleantx(struct enet_rx *rx);
> +netdev_tx_t eea_tx_xmit(struct sk_buff *skb, struct net_device *netdev);
> +
> +void eea_tx_timeout(struct net_device *netdev, u32 txqueue);
> +
> +void eea_free_tx(struct enet_tx *tx);
> +int eea_alloc_tx(struct eea_net_tmp *tmp, struct enet_tx *tx, u32 idx);
> +
> +#endif
> diff --git a/drivers/net/ethernet/alibaba/eea/eea_pci.c b/drivers/net/ethernet/alibaba/eea/eea_pci.c
> new file mode 100644
> index 000000000000..df84f9a9c543
> --- /dev/null
> +++ b/drivers/net/ethernet/alibaba/eea/eea_pci.c
> @@ -0,0 +1,574 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * Driver for Alibaba Elastic Ethernet Adaptor.
> + *
> + * Copyright (C) 2025 Alibaba Inc.
> + */
> +
> +#include <linux/iopoll.h>
> +#include <linux/io-64-nonatomic-lo-hi.h>
> +
> +#include "eea_pci.h"
> +#include "eea_net.h"
> +
> +#define EEA_PCI_DB_OFFSET 4096
> +
> +#define EEA_PCI_CAP_RESET_DEVICE 0xFA
> +#define EEA_PCI_CAP_RESET_FLAG BIT(1)
> +
> +struct eea_pci_cfg {
> +	__le32 reserve0;
> +	__le32 reserve1;
> +	__le32 drv_f_idx;
> +	__le32 drv_f;
> +
> +#define EEA_S_OK           BIT(2)
> +#define EEA_S_FEATURE_DONE BIT(3)
> +#define EEA_S_FAILED       BIT(7)
> +	u8   device_status;
> +	u8   reserved[7];
> +
> +	__le32 rx_num_max;
> +	__le32 tx_num_max;
> +	__le32 db_blk_size;
> +
> +	/* admin queue cfg */
> +	__le16 aq_size;
> +	__le16 aq_msix_vector;
> +	__le32 aq_db_off;
> +
> +	__le32 aq_sq_addr;
> +	__le32 aq_sq_addr_hi;
> +	__le32 aq_cq_addr;
> +	__le32 aq_cq_addr_hi;
> +
> +	__le64 hw_ts;
> +};
> +
> +struct eea_pci_device {
> +	struct eea_device edev;
> +	struct pci_dev *pci_dev;
> +
> +	u32 msix_vec_n;
> +
> +	void __iomem *reg;
> +	void __iomem *db_base;
> +
> +	struct work_struct ha_handle_work;
> +	char ha_irq_name[32];
> +	u8 reset_pos;
> +};
> +
> +#define cfg_pointer(reg, item) \
> +	((void __iomem *)((reg) + offsetof(struct eea_pci_cfg, item)))
> +
> +#define cfg_write8(reg, item, val) iowrite8(val, cfg_pointer(reg, item))
> +#define cfg_write16(reg, item, val) iowrite16(val, cfg_pointer(reg, item))
> +#define cfg_write32(reg, item, val) iowrite32(val, cfg_pointer(reg, item))
> +#define cfg_write64(reg, item, val) iowrite64_lo_hi(val, cfg_pointer(reg, item))
> +
> +#define cfg_read8(reg, item) ioread8(cfg_pointer(reg, item))
> +#define cfg_read32(reg, item) ioread32(cfg_pointer(reg, item))
> +#define cfg_readq(reg, item) readq(cfg_pointer(reg, item))
> +
> +/* Due to circular references, we have to add function definitions here. */
> +static int __eea_pci_probe(struct pci_dev *pci_dev,
> +			   struct eea_pci_device *ep_dev);
> +static void __eea_pci_remove(struct pci_dev *pci_dev, bool flush_ha_work);
> +
> +const char *eea_pci_name(struct eea_device *edev)

generally such a thin wrappers for kernel API are discouraged
(this driver would be part of the kernel, if someone will change
function that you call in a way that requires change of caller, they
will also change your driver;
it is also easier for reviewers and maintainers to see something
common to them instead of eea_pci_dev_id())

> +{
> +	return pci_name(edev->ep_dev->pci_dev);
> +}
> +
> +int eea_pci_domain_nr(struct eea_device *edev)
> +{
> +	return pci_domain_nr(edev->ep_dev->pci_dev->bus);
> +}
> +
> +u16 eea_pci_dev_id(struct eea_device *edev)
> +{
> +	return pci_dev_id(edev->ep_dev->pci_dev);
> +}
> +
> +static void eea_pci_io_set_status(struct eea_device *edev, u8 status)
> +{
> +	struct eea_pci_device *ep_dev = edev->ep_dev;
> +
> +	cfg_write8(ep_dev->reg, device_status, status);
> +}
> +
> +static u8 eea_pci_io_get_status(struct eea_device *edev)
> +{
> +	struct eea_pci_device *ep_dev = edev->ep_dev;
> +
> +	return cfg_read8(ep_dev->reg, device_status);
> +}
> +
> +static void eea_add_status(struct eea_device *dev, u32 status)
> +{
> +	eea_pci_io_set_status(dev, eea_pci_io_get_status(dev) | status);
> +}
> +

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ