netdev - Re: [PATCH net-next RFC 2/3] net/mlx5: Introduce shared devlink instance for PFs on same chip


Message-ID: <3be26dca-3230-4fd6-8421-652f95c72163@intel.com>
Date: Wed, 19 Mar 2025 09:21:52 +0100
From: Przemek Kitszel <przemyslaw.kitszel@...el.com>
To: "Keller, Jacob E" <jacob.e.keller@...el.com>, Jiri Pirko
	<jiri@...nulli.us>, "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
CC: "davem@...emloft.net" <davem@...emloft.net>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>, "Dumazet, Eric" <edumazet@...gle.com>,
	"kuba@...nel.org" <kuba@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>,
	"saeedm@...dia.com" <saeedm@...dia.com>, "leon@...nel.org" <leon@...nel.org>,
	"tariqt@...dia.com" <tariqt@...dia.com>, "andrew+netdev@...n.ch"
	<andrew+netdev@...n.ch>, "dakr@...nel.org" <dakr@...nel.org>,
	"rafael@...nel.org" <rafael@...nel.org>, "Nguyen, Anthony L"
	<anthony.l.nguyen@...el.com>, "cratiu@...dia.com" <cratiu@...dia.com>,
	"Knitter, Konrad" <konrad.knitter@...el.com>, "cjubran@...dia.com"
	<cjubran@...dia.com>
Subject: Re: [PATCH net-next RFC 2/3] net/mlx5: Introduce shared devlink
 instance for PFs on same chip

On 3/18/25 23:05, Keller, Jacob E wrote:
> 
> 
>> -----Original Message-----
>> From: Jiri Pirko <jiri@...nulli.us>

>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/sh_devlink.c
>> @@ -0,0 +1,150 @@
>> +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
>> +/* Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
>> +
>> +#include <linux/device/faux.h>
>> +#include <linux/mlx5/driver.h>
>> +#include <linux/mlx5/vport.h>
>> +
>> +#include "sh_devlink.h"
>> +
>> +static LIST_HEAD(shd_list);
>> +static DEFINE_MUTEX(shd_mutex); /* Protects shd_list and shd->list */

I essentially agree: faux_device can be used as-is, without any
devlink changes, and that works for me.
That does not remove the need to invent the name at some point ;)

We have resolved this in a similar manner, and that's fine, given my
understanding that you cannot let faux dispatch for you, like:
faux_get_instance(serial_number_equivalent)

>> +
>> +/* This structure represents a shared devlink instance,
>> + * there is one created for PF group of the same chip.
>> + */
>> +struct mlx5_shd {
>> +	/* Node in shd list */
>> +	struct list_head list;
>> +	/* Serial number of the chip */
>> +	const char *sn;
>> +	/* List of per-PF dev instances. */
>> +	struct list_head dev_list;
>> +	/* Related faux device */
>> +	struct faux_device *faux_dev;
>> +};
>> +
> 
> For ice, the equivalent of this would essentially replace ice_adapter I imagine.

or "ice_adapter will be the ice equivalent"

> 
>> +static const struct devlink_ops mlx5_shd_ops = {

please double check if there is no crash for:
$ devlink dev info the/faux/thing

>> +};
>> +
>> +static int mlx5_shd_faux_probe(struct faux_device *faux_dev)
>> +{
>> +	struct devlink *devlink;
>> +	struct mlx5_shd *shd;
>> +
>> +	devlink = devlink_alloc(&mlx5_shd_ops, sizeof(struct mlx5_shd),

sizeof(*shd)

I like that you reuse devlink_alloc(), with allocation of priv data;
that also suits our needs

>> &faux_dev->dev);
>> +	if (!devlink)
>> +		return -ENOMEM;
>> +	shd = devlink_priv(devlink);
>> +	faux_device_set_drvdata(faux_dev, shd);
>> +
>> +	devl_lock(devlink);
>> +	devl_register(devlink);
>> +	devl_unlock(devlink);
>> +	return 0;
>> +}
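
To illustrate the two remarks on the allocation (sizeof(*shd), and the
private data living inside the allocated object), here is a minimal
user-space sketch. struct obj, obj_alloc() and obj_priv() are
hypothetical stand-ins for the roles of devlink_alloc() and
devlink_priv(); this is not how devlink itself is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical container: a small header followed by caller-private data. */
struct obj {
	size_t priv_size;
	_Alignas(max_align_t) unsigned char priv[];
};

/* Hypothetical private data, loosely modelled on struct mlx5_shd. */
struct shd {
	const char *sn;
	int users;
};

/* Allocate header and zeroed private area in a single allocation. */
static struct obj *obj_alloc(size_t priv_size)
{
	struct obj *o = calloc(1, sizeof(*o) + priv_size);

	if (o)
		o->priv_size = priv_size;
	return o;
}

/* Return the private area that trails the header. */
static void *obj_priv(struct obj *o)
{
	assert(o);
	return o->priv;
}
```

A caller would write "o = obj_alloc(sizeof(*shd)); shd = obj_priv(o);".
With sizeof(*shd) the size follows the pointer's type, so it stays
correct if the type of shd is ever changed.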

[...]

>> +int mlx5_shd_init(struct mlx5_core_dev *dev)
>> +{
>> +	u8 *vpd_data __free(kfree) = NULL;

too bad that netdev maintainers discourage __free() :(
perhaps I should propose a higher abstraction wrapper for it
on April 1st

>> +	struct pci_dev *pdev = dev->pdev;
>> +	unsigned int vpd_size, kw_len;
>> +	struct mlx5_shd *shd;
>> +	const char *sn;

I would extract name retrieval, perhaps mlx5_shd_get_name()?
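
A rough user-space sketch of what such a helper could look like, just
for the string handling part (copy the keyword value, cut at the first
space, as the patch does further down). shd_name_from_vpd() is a
hypothetical name, and malloc()/strchr() stand in for the kernel's
kstrndup()/strchrnul(); the VPD lookup itself is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy kw_len bytes of a keyword value starting at vpd + start and
 * terminate the copy at the first space, if there is one.
 * Returns a malloc()ed string the caller must free(), or NULL if the
 * allocation fails.
 */
static char *shd_name_from_vpd(const char *vpd, size_t start, size_t kw_len)
{
	char *name, *end;

	assert(vpd);
	name = malloc(kw_len + 1);
	if (!name)
		return NULL;
	memcpy(name, vpd + start, kw_len);
	name[kw_len] = '\0';

	/* Same effect as strchrnul(name, ' ') followed by *end = '\0'. */
	end = strchr(name, ' ');
	if (end)
		*end = '\0';
	return name;
}
```

With that split out, mlx5_shd_init() would only deal with lookup and
creation of the shared instance, and the error paths for the name would
live in one place.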

>> +	char *end;
>> +	int start;
>> +	int err;
>> +
>> +	if (!mlx5_core_is_pf(dev))
>> +		return 0;
>> +
>> +	vpd_data = pci_vpd_alloc(pdev, &vpd_size);
>> +	if (IS_ERR(vpd_data)) {
>> +		err = PTR_ERR(vpd_data);
>> +		return err == -ENODEV ? 0 : err;

what? does that mean the driver will work properly without the
shared devlink instance?

>> +	}
>> +	start = pci_vpd_find_ro_info_keyword(vpd_data, vpd_size, "V3", &kw_len);
>> +	if (start < 0) {
>> +		/* Fall-back to SN for older devices. */
>> +		start = pci_vpd_find_ro_info_keyword(vpd_data, vpd_size,
>> +						     PCI_VPD_RO_KEYWORD_SERIALNO, &kw_len);
>> +		if (start < 0)
>> +			return -ENOENT;
>> +	}
>> +	sn = kstrndup(vpd_data + start, kw_len, GFP_KERNEL);
>> +	if (!sn)
>> +		return -ENOMEM;
>> +	end = strchrnul(sn, ' ');
>> +	*end = '\0';
>> +
>> +	guard(mutex)(&shd_mutex);

guard()() is a no-no too, per "discouraged by netdev maintainers",
and here I'm on board with discouraging ;)

>> +	list_for_each_entry(shd, &shd_list, list) {
>> +		if (!strcmp(shd->sn, sn)) {
>> +			kfree(sn);
>> +			goto found;
>> +		}
>> +	}
>> +	shd = mlx5_shd_create(sn);
>> +	if (!shd) {
>> +		kfree(sn);
>> +		return -ENOMEM;
>> +	}
> 
> How is the faux device kept in memory? I guess its reference counted somewhere? 

get_device()/put_device() with faux_dev->dev as argument

> But I don't see that reference being incremented in the list_for_each.

Jiri keeps "the counter" implicitly, as the size of the shd list :)
which is protected by the mutex
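
To make that concrete, a minimal user-space sketch of the pattern as I
understand it: lookup-or-create under one mutex, with the number of
attached devices acting as the reference count. shd_attach(),
shd_detach() and the ndevs field are hypothetical; ndevs stands in for
the length of the per-PF dev_list, and a pthread mutex stands in for
shd_mutex. Locking is spelled out with explicit lock/unlock instead of
guard().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct shd {
	struct shd *next;	/* node in the global list */
	char *sn;		/* serial number used as the lookup key */
	int ndevs;		/* stand-in for the dev_list length */
};

static struct shd *shd_list;
static pthread_mutex_t shd_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Attach one device to the instance for sn, creating it on first use. */
static struct shd *shd_attach(const char *sn)
{
	struct shd *shd;

	assert(sn);
	pthread_mutex_lock(&shd_mutex);
	for (shd = shd_list; shd; shd = shd->next)
		if (!strcmp(shd->sn, sn))
			goto found;

	shd = calloc(1, sizeof(*shd));
	if (!shd)
		goto unlock;
	shd->sn = malloc(strlen(sn) + 1);
	if (!shd->sn) {
		free(shd);
		shd = NULL;
		goto unlock;
	}
	strcpy(shd->sn, sn);
	shd->next = shd_list;
	shd_list = shd;
found:
	shd->ndevs++;
unlock:
	pthread_mutex_unlock(&shd_mutex);
	return shd;
}

/* Detach one device; the last one out unlinks and frees the instance. */
static void shd_detach(struct shd *shd)
{
	struct shd **pp;

	pthread_mutex_lock(&shd_mutex);
	if (--shd->ndevs == 0) {
		for (pp = &shd_list; *pp; pp = &(*pp)->next) {
			if (*pp == shd) {
				*pp = shd->next;
				break;
			}
		}
		free(shd->sn);
		free(shd);
	}
	pthread_mutex_unlock(&shd_mutex);
}
```

Since every attach and detach happens under the same mutex, the
instance cannot go away between the lookup and the increment, so no
separate get/put is needed inside the list walk.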