lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CO1PR11MB5089D1F871606F9BEF805B84D6D92@CO1PR11MB5089.namprd11.prod.outlook.com>
Date: Wed, 19 Mar 2025 18:42:28 +0000
From: "Keller, Jacob E" <jacob.e.keller@...el.com>
To: Jiri Pirko <jiri@...nulli.us>, "Kitszel, Przemyslaw"
	<przemyslaw.kitszel@...el.com>
CC: "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
	"davem@...emloft.net" <davem@...emloft.net>, "netdev@...r.kernel.org"
	<netdev@...r.kernel.org>, "Dumazet, Eric" <edumazet@...gle.com>,
	"kuba@...nel.org" <kuba@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>,
	"saeedm@...dia.com" <saeedm@...dia.com>, "leon@...nel.org" <leon@...nel.org>,
	"tariqt@...dia.com" <tariqt@...dia.com>, "andrew+netdev@...n.ch"
	<andrew+netdev@...n.ch>, "dakr@...nel.org" <dakr@...nel.org>,
	"rafael@...nel.org" <rafael@...nel.org>, "Nguyen, Anthony L"
	<anthony.l.nguyen@...el.com>, "cratiu@...dia.com" <cratiu@...dia.com>,
	"Knitter, Konrad" <konrad.knitter@...el.com>, "cjubran@...dia.com"
	<cjubran@...dia.com>
Subject: RE: [PATCH net-next RFC 2/3] net/mlx5: Introduce shared devlink
 instance for PFs on same chip



> -----Original Message-----
> From: Jiri Pirko <jiri@...nulli.us>
> Sent: Wednesday, March 19, 2025 4:56 AM
> To: Kitszel, Przemyslaw <przemyslaw.kitszel@...el.com>
> Cc: Keller, Jacob E <jacob.e.keller@...el.com>; gregkh@...uxfoundation.org;
> davem@...emloft.net; netdev@...r.kernel.org; Dumazet, Eric
> <edumazet@...gle.com>; kuba@...nel.org; pabeni@...hat.com;
> saeedm@...dia.com; leon@...nel.org; tariqt@...dia.com;
> andrew+netdev@...n.ch; dakr@...nel.org; rafael@...nel.org; Nguyen, Anthony L
> <anthony.l.nguyen@...el.com>; cratiu@...dia.com; Knitter, Konrad
> <konrad.knitter@...el.com>; cjubran@...dia.com
> Subject: Re: [PATCH net-next RFC 2/3] net/mlx5: Introduce shared devlink instance
> for PFs on same chip
> 
> Wed, Mar 19, 2025 at 09:21:52AM +0100, przemyslaw.kitszel@...el.com wrote:
> >On 3/18/25 23:05, Keller, Jacob E wrote:
> >>
> >>
> >> > -----Original Message-----
> >> > From: Jiri Pirko <jiri@...nulli.us>
> >
> >> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/sh_devlink.c
> >> > @@ -0,0 +1,150 @@
> >> > +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
> >> > +/* Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved. */
> >> > +
> >> > +#include <linux/device/faux.h>
> >> > +#include <linux/mlx5/driver.h>
> >> > +#include <linux/mlx5/vport.h>
> >> > +
> >> > +#include "sh_devlink.h"
> >> > +
> >> > +static LIST_HEAD(shd_list);
> >> > +static DEFINE_MUTEX(shd_mutex); /* Protects shd_list and shd->list */
> >
> >I essentially agree that faux_device could be used as-is, without any
> >devlink changes, works for me.
> >That does not remove the need to invent the name at some point ;)
> 
> What name? Name of faux device? Sure. In case of ice it's probably PCI DSN
> 
> >
> >we have resolved this in similar manner, that's fine, given my
> >understanding that you cannot let faux to dispatch for you, like:
> >faux_get_instance(serial_number_equivalent)
> 
> Not sure what you are asking TBH.
> 
> 
> >
> >> > +
> >> > +/* This structure represents a shared devlink instance,
> >> > + * there is one created for PF group of the same chip.
> >> > + */
> >> > +struct mlx5_shd {
> >> > +	/* Node in shd list */
> >> > +	struct list_head list;
> >> > +	/* Serial number of the chip */
> >> > +	const char *sn;
> >> > +	/* List of per-PF dev instances. */
> >> > +	struct list_head dev_list;
> >> > +	/* Related faux device */
> >> > +	struct faux_device *faux_dev;
> >> > +};
> >> > +
> >>
> >> For ice, the equivalent of this would essentially replace ice_adapter I imagine.
> >
> >or "ice_adapter will be the ice equivalent"
> 
> Oh yes, that makes sense.
> 
> 
> >
> >>
> >> > +static const struct devlink_ops mlx5_shd_ops = {
> >
> >please double check if there is no crash for:
> >$ devlink dev info the/faux/thing
> 
> Will do, but why do you think so?
> 
> 
> >
> >> > +};
> >> > +
> >> > +static int mlx5_shd_faux_probe(struct faux_device *faux_dev)
> >> > +{
> >> > +	struct devlink *devlink;
> >> > +	struct mlx5_shd *shd;
> >> > +
> >> > +	devlink = devlink_alloc(&mlx5_shd_ops, sizeof(struct mlx5_shd),
> >
> >sizeof(*shd)
> 
> Sure.
> 
> 
> >
> >I like that you reuse devlink_alloc(), with allocation of priv data,
> >that suits also our needs
> >
> >> > &faux_dev->dev);
> >> > +	if (!devlink)
> >> > +		return -ENOMEM;
> >> > +	shd = devlink_priv(devlink);
> >> > +	faux_device_set_drvdata(faux_dev, shd);
> >> > +
> >> > +	devl_lock(devlink);
> >> > +	devl_register(devlink);
> >> > +	devl_unlock(devlink);
> >> > +	return 0;
> >> > +}
> >
> >[...]
> >
> >> > +int mlx5_shd_init(struct mlx5_core_dev *dev)
> >> > +{
> >> > +	u8 *vpd_data __free(kfree) = NULL;
> >
> >so bad that netdev mainainers discourage __free() :(
> >perhaps I should propose higher abstraction wrapper for it
> >on April 1st
> >
> >> > +	struct pci_dev *pdev = dev->pdev;
> >> > +	unsigned int vpd_size, kw_len;
> >> > +	struct mlx5_shd *shd;
> >> > +	const char *sn;
> >
> >I would extract name retrieval, perhaps mlx5_shd_get_name()?
> 
> I had that, did not really make the code nicer :)
> 
> 
> >
> >> > +	char *end;
> >> > +	int start;
> >> > +	int err;
> >> > +
> >> > +	if (!mlx5_core_is_pf(dev))
> >> > +		return 0;
> >> > +
> >> > +	vpd_data = pci_vpd_alloc(pdev, &vpd_size);
> >> > +	if (IS_ERR(vpd_data)) {
> >> > +		err = PTR_ERR(vpd_data);
> >> > +		return err == -ENODEV ? 0 : err;
> >
> >what? that means the shared devlink instance is something you will
> >work properly without?
> 
> Not sure. This is something that should not happen for any existing
> device.
> 
> 
> >
> >> > +	}
> >> > +	start = pci_vpd_find_ro_info_keyword(vpd_data, vpd_size, "V3",
> >> > &kw_len);
> >> > +	if (start < 0) {
> >> > +		/* Fall-back to SN for older devices. */
> >> > +		start = pci_vpd_find_ro_info_keyword(vpd_data, vpd_size,
> >> > +
> >> > PCI_VPD_RO_KEYWORD_SERIALNO, &kw_len);
> >> > +		if (start < 0)
> >> > +			return -ENOENT;
> >> > +	}
> >> > +	sn = kstrndup(vpd_data + start, kw_len, GFP_KERNEL);
> >> > +	if (!sn)
> >> > +		return -ENOMEM;
> >> > +	end = strchrnul(sn, ' ');
> >> > +	*end = '\0';
> >> > +
> >> > +	guard(mutex)(&shd_mutex);
> >
> >guard()() is a no-no too, per "discouraged by netdev maintainers",
> >and here I'm on board with discouraging ;)
> 
> That's a fight with windmills. It will happen sooner than later anyway.
> It is just too conventient. I don't understand why netdev has to have
> special treat comparing to the rest of the kernel all the time...
> 
> 
> >
> >> > +	list_for_each_entry(shd, &shd_list, list) {
> >> > +		if (!strcmp(shd->sn, sn)) {
> >> > +			kfree(sn);
> >> > +			goto found;
> >> > +		}
> >> > +	}
> >> > +	shd = mlx5_shd_create(sn);
> >> > +	if (!shd) {
> >> > +		kfree(sn);
> >> > +		return -ENOMEM;
> >> > +	}
> >>
> >> How is the faux device kept in memory? I guess its reference counted
> >> somewhere?
> >
> >get_device()/put_device() with faxu_dev->dev as argument
> >
> >But I don't see that reference being incremented in the list_for_each.
> >
> >Jiri keeps "the counter" as the implicit observation of shd list size :)
> >which is protected by mutex
> 
> Yep. Why isn't that enough? I need the list anyway. Plus, I'm using the
> list to reference count shd, not the fauxdev. Fauxdev is explicitly
> create/destroyed by calling appropriate function. I belive that is the
> correct way (maybe the only way?) to instantiate/deinstantiate faux.

Re-using the list as the count for how many users seems reasonable. I just didn't catch that's  how you did it.

Thanks,
Jake

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ