lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1753414106.2053263-1-xuanzhuo@linux.alibaba.com>
Date: Fri, 25 Jul 2025 11:28:26 +0800
From: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
To: Andrew Lunn <andrew@...n.ch>
Cc: netdev@...r.kernel.org,
 Andrew Lunn <andrew+netdev@...n.ch>,
 "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>,
 Paolo Abeni <pabeni@...hat.com>,
 Wen Gu <guwen@...ux.alibaba.com>,
 Philo Lu <lulie@...ux.alibaba.com>,
 Lorenzo Bianconi <lorenzo@...nel.org>,
 Lukas Bulwahn <lukas.bulwahn@...hat.com>,
 Parthiban Veerasooran <Parthiban.Veerasooran@...rochip.com>,
 Geert Uytterhoeven <geert+renesas@...der.be>,
 Dust Li <dust.li@...ux.alibaba.com>
Subject: Re: [PATCH net-next v1] eea: Add basic driver framework for Alibaba Elastic Ethernet Adaptor

On Thu, 24 Jul 2025 19:42:52 +0200, Andrew Lunn <andrew@...n.ch> wrote:
> > @@ -202,5 +202,6 @@ source "drivers/net/ethernet/wangxun/Kconfig"
> >  source "drivers/net/ethernet/wiznet/Kconfig"
> >  source "drivers/net/ethernet/xilinx/Kconfig"
> >  source "drivers/net/ethernet/xircom/Kconfig"
> > +source "drivers/net/ethernet/alibaba/Kconfig"
>
> This file is sorted. Please insert in the correct location.
>
> > +struct eea_aq_host_info_cfg {
> > +#ifndef EEA_OS_DISTRO
> > +#define EEA_OS_DISTRO		0
> > +#endif
> > +
> > +#ifndef EEA_DRV_TYPE
> > +#define EEA_DRV_TYPE		0
> > +#endif
>
> Please remove these #ifndef. You know they are not defined, because
> they are not defined.
>
> > +int eea_adminq_config_host_info(struct eea_net *enet)
> > +{
> > +	struct device *dev = enet->edev->dma_dev;
> > +	struct eea_aq_host_info_cfg *cfg;
> > +	struct eea_aq_host_info_rep *rep;
> > +	int rc = -ENOMEM;
> > +
> > +	cfg = kzalloc(sizeof(*cfg), GFP_KERNEL);
> > +	if (!cfg)
> > +		return rc;
> > +
> > +	rep = kzalloc(sizeof(*rep), GFP_KERNEL);
> > +	if (!rep)
> > +		goto free_cfg;
> > +
> > +	cfg->os_type = EEA_OS_LINUX;
> > +	cfg->os_dist = EEA_OS_DISTRO;
> > +	cfg->drv_type = EEA_DRV_TYPE;
> > +
> > +	if (sscanf(utsname()->release, "%hu.%hu.%hu", &cfg->kern_ver_major,
> > +		   &cfg->kern_ver_minor, &cfg->kern_ver_sub_minor) != 3) {
> > +		rc = -EINVAL;
> > +		goto free_rep;
> > +	}
> > +
>
> LINUX_VERSION_MAJOR 6
> LINUX_VERSION_PATCHLEVEL 11
> LINUX_VERSION_SUBLEVEL 0
>
> I still think you should not be doing this.

Please give us a chance to try.


>
> > +static int eea_netdev_stop(struct net_device *netdev)
> > +{
> > +	struct eea_net *enet = netdev_priv(netdev);
> > +
> > +	if (!enet->started) {
> > +		netdev_warn(netdev, "eea netdev stop: but dev is not started.\n");
> > +		return 0;
> > +	}
>
> How does this happen?

This function can be called from other contexts.

When we receive an HA interrupt, it may indicate that there is an error inside
the device, and this function may be called as a result.


>
> > +static int eea_netdev_open(struct net_device *netdev)
> > +{
> > +	struct eea_net *enet = netdev_priv(netdev);
> > +	struct eea_net_tmp tmp = {};
> > +	int err;
> > +
> > +	if (enet->link_err) {
> > +		netdev_err(netdev, "netdev open err, because link error: %d\n",
> > +			   enet->link_err);
>
> What is a link error? You should be able to admin an interface up if
> the cable is not plugged in.


The device may send an interrupt to the driver, and then the driver can query
the device. If there is an error inside the device, the driver will stop
the device. If the user tries to bring up the network device, we will
attempt to prevent that operation.


>
> > +static int eea_netdev_init_features(struct net_device *netdev,
> > +				    struct eea_net *enet,
> > +				    struct eea_device *edev)
> > +{
> > +	eea_update_cfg(enet, edev, cfg);
>
> > +	netdev->min_mtu = ETH_MIN_MTU;
> > +
> > +	mtu = le16_to_cpu(cfg->mtu);
> > +	if (mtu < netdev->min_mtu) {
> > +		dev_err(edev->dma_dev, "device MTU too small. %d < %d", mtu,
> > +			netdev->min_mtu);
>
> This message does not really make it clear the firmware is completely
> broken, and passing invalid values to Linux. I think it is good not to
> trust the firmware, but when the firmware is broken, you probably
> should make that clear to the user.
>
> > +static int __eea_pci_probe(struct pci_dev *pci_dev,
> > +			   struct eea_pci_device *ep_dev);
> > +static void __eea_pci_remove(struct pci_dev *pci_dev, bool flush_ha_work);
>
> No forward declarations. Put the code in the correct order so they are
> not needed.

Here A calls B, B calls C, and C calls A. Please believe me, I don't like this
approach either, but this is the simplest way and allows us to keep the related
code together.


>
> > +static irqreturn_t eea_pci_ha_handle(int irq, void *data)
> > +{
> > +	struct eea_device *edev = data;
> > +
> > +	schedule_work(&edev->ep_dev->ha_handle_work);
> > +
> > +	return IRQ_HANDLED;
> > +}
>
> A threaded interrupt handler might make this simpler.

In our case, the HA interrupt is mainly used to notify some abnormal conditions
or errors from the device side. I think this is a low-probability event. My
understanding is that threaded interrupt hqandler needs to create a thread in
advance, which I consider to be a waste, so I didn't choose to do it this way.

Thanks.

>
> 	Andrew
>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ