netdev - Re: [PATCH net-next] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <322af656-d359-44d8-9e40-4f997a8b7e0f@lunn.ch>
Date: Wed, 16 Jul 2025 18:25:40 +0200
From: Andrew Lunn <andrew@...n.ch>
To: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
Cc: netdev@...r.kernel.org, Andrew Lunn <andrew+netdev@...n.ch>,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Wen Gu <guwen@...ux.alibaba.com>,
	Philo Lu <lulie@...ux.alibaba.com>,
	Lorenzo Bianconi <lorenzo@...nel.org>,
	Lukas Bulwahn <lukas.bulwahn@...hat.com>,
	Parthiban Veerasooran <Parthiban.Veerasooran@...rochip.com>,
	Geert Uytterhoeven <geert+renesas@...der.be>,
	Alexander Duyck <alexanderduyck@...com>,
	Dust Li <dust.li@...ux.alibaba.com>
Subject: Re: [PATCH net-next] eea: Add basic driver framework for Alibaba
 Elastic Ethernet Adaptor

On Wed, Jul 16, 2025 at 01:47:32PM +0800, Xuan Zhuo wrote:
> Thank you for your valuable feedback. We've addressed most of the comments
> and will include the fixes in the next version. A few remaining items are still
> under discussion and listed below for reference.
> 
> On Thu, 10 Jul 2025 15:45:38 +0200, Andrew Lunn <andrew@...n.ch> wrote:
> > > +module_param(aq_timeout, uint, 0644);
> >
> > No module params please.
> >
> > > +struct eea_aq_host_info_cfg {
> > > +#ifndef EEA_OS_DISTRO
> > > +#define EEA_OS_DISTRO		0
> > > +#endif
> > > +
> > > +#ifndef EEA_DRV_TYPE
> > > +#define EEA_DRV_TYPE		0
> > > +#endif
> > > +
> > > +#define EEA_OS_LINUX		1
> > > +#define EEA_SPEC_VER_MAJOR	1
> > > +#define EEA_SPEC_VER_MINOR	0
> > > +	__le16	os_type;        /* Linux, Win.. */
> > > +	__le16	os_dist;
> > > +	__le16	drv_type;
> > > +
> > > +	__le16	kern_ver_major;
> > > +	__le16	kern_ver_minor;
> > > +	__le16	kern_ver_sub_minor;
> > > +
> > > +	__le16	drv_ver_major;
> > > +	__le16	drv_ver_minor;
> > > +	__le16	drv_ver_sub_minor;
> > > +
> > > +	__le16	spec_ver_major;
> > > +	__le16	spec_ver_minor;
> > > +	__le16	pci_bdf;
> > > +	__le32	pci_domain;
> > > +
> > > +	u8      os_ver_str[64];
> > > +	u8      isa_str[64];
> >
> > Why does it care about the OS, kernel version etc?
> 
> Then the device can know the version, the dpu can do something for bug of the
> driver.

That is not a very good explanation. Do you see any other system in
Linux were the firmware works around bug in Linux drivers using the
kernel version?

You also need to think about enterprise kernels, like RedHat,
Oracle. They don't give a truthful kernel version, they have thousands
of patches on top fixing, and creating bugs. How will you handle that?

Please drop all this, and just fix the bugs in the driver.

> > > +	start = get_jiffies_64();
> > > +	while (!(cdesc = ering_cq_get_desc(enet->adminq.ring))) {
> > > +		cond_resched();
> > > +		cpu_relax();
> > > +
> > > +		timeout = secs_to_jiffies(READ_ONCE(aq_timeout));
> > > +		if (time_after64(get_jiffies_64(), start + timeout)) {
> > > +			netdev_err(enet->netdev, "admin queue timeout. timeout %d\n",
> > > +				   READ_ONCE(aq_timeout));
> > > +			return -1;
> > > +		}
> > > +	}
> >
> > See if you can one of the macros from iopoll.h
> 
> Here we do not access the pci register directly, if we use the iopoll.h
> we need to break the api ering_cq_get_desc. So I think we should not use
> the api of iopoll.h here.

#define read_poll_timeout(op, val, cond, sleep_us, timeout_us, \
                                sleep_before_read, args...) \
({ \
        u64 __timeout_us = (timeout_us); \
        unsigned long __sleep_us = (sleep_us); \
        ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
        might_sleep_if((__sleep_us) != 0); \
        if (sleep_before_read && __sleep_us) \
                usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
        for (;;) { \
                (val) = op(args); \
                if (cond) \
                        break; \


op: ering_cq_get_desc.
val: cdesc.
args: enet->adminq.ring
cond: !cdesc

I might be wrong, but i think you can make this work.

> > > +static void eea_get_drvinfo(struct net_device *netdev,
> > > +			    struct ethtool_drvinfo *info)
> > > +{
> > > +	struct eea_net *enet = netdev_priv(netdev);
> > > +	struct eea_device *edev = enet->edev;
> > > +
> > > +	strscpy(info->driver,   KBUILD_MODNAME,     sizeof(info->driver));
> > > +	strscpy(info->bus_info, eea_pci_name(edev), sizeof(info->bus_info));
> > > +	snprintf(info->version, sizeof(info->version), "%d.%d.%d",
> > > +		 EEA_VER_MAJOR, EEA_VER_MINOR, EEA_VER_SUB_MINOR);
> >
> > A hard coded version is pointless, because it never changes, yet the
> > kernel around the driver changes every week. Don't set version, and
> > the core will fill in the git hash, which is useful.
> 
> In our plan, we will increase this version when we change the code.

So you will be submitting a patch for GregKH for every single stable
kernel? That will be around 5 patches, every two weeks, for the next
30 years?

As i said, the driver is not standalone, it is embedded within the
kernel. Changes to the kernel can break the driver. When you get a bug
report from a customer, and they say version v42 of the driver is
broken, isn't the first thing you are going to ask is what kernel
version? Is it a vendor kernel? Which vendor?

If you leave version unfilled, ethtool will report something like:

version: 6.12.29-amd64

which is much more useful.

> > > +	mtu = le16_to_cpu(cfg->mtu);
> > > +	if (mtu < netdev->min_mtu) {
> > > +		dev_err(edev->dma_dev, "device MTU too small. %d < %d", mtu, netdev->min_mtu);
> > > +		return -EINVAL;
> > > +	}
> > > +
> > > +	netdev->mtu = mtu;
> > > +	netdev->max_mtu = mtu;
> >
> > Setting mtu the same as max_mtu is unusual? Are you defaulting to jumbo?
> 
> In the cloud the dpu controls this.

No, linux controls this. This driver controls this, using these line
here. Userspace can change it later, but this is the default. And 99%
of Linux systems default to 1500. Is max_mtu 1500? Or is it some jumbo
value? You at last need to add a comment you are ignoring what
everybody else does and are setting the MTU to something larger,
because that is what your use case is.

You are aiming to write an Ethenet driver which looks pretty much like
every other Ethernet driver in Linux. When you do something different,
you need to justify it, add a comment why your device is special and
needs to do something different.

	Andrew