Message-ID: <3be26dca-3230-4fd6-8421-652f95c72163@intel.com>
Date: Wed, 19 Mar 2025 09:21:52 +0100
From: Przemek Kitszel <przemyslaw.kitszel@...el.com>
To: "Keller, Jacob E" <jacob.e.keller@...el.com>, Jiri Pirko
<jiri@...nulli.us>, "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>
CC: "davem@...emloft.net" <davem@...emloft.net>, "netdev@...r.kernel.org"
<netdev@...r.kernel.org>, "Dumazet, Eric" <edumazet@...gle.com>,
"kuba@...nel.org" <kuba@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>,
"saeedm@...dia.com" <saeedm@...dia.com>, "leon@...nel.org" <leon@...nel.org>,
"tariqt@...dia.com" <tariqt@...dia.com>, "andrew+netdev@...n.ch"
<andrew+netdev@...n.ch>, "dakr@...nel.org" <dakr@...nel.org>,
"rafael@...nel.org" <rafael@...nel.org>, "Nguyen, Anthony L"
<anthony.l.nguyen@...el.com>, "cratiu@...dia.com" <cratiu@...dia.com>,
"Knitter, Konrad" <konrad.knitter@...el.com>, "cjubran@...dia.com"
<cjubran@...dia.com>
Subject: Re: [PATCH net-next RFC 2/3] net/mlx5: Introduce shared devlink
instance for PFs on same chip
On 3/18/25 23:05, Keller, Jacob E wrote:
>
>
>> -----Original Message-----
>> From: Jiri Pirko <jiri@...nulli.us>
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/sh_devlink.c
>> @@ -0,0 +1,150 @@
>> +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
>> +/* Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
>> +
>> +#include <linux/device/faux.h>
>> +#include <linux/mlx5/driver.h>
>> +#include <linux/mlx5/vport.h>
>> +
>> +#include "sh_devlink.h"
>> +
>> +static LIST_HEAD(shd_list);
>> +static DEFINE_MUTEX(shd_mutex); /* Protects shd_list and shd->list */
I essentially agree that faux_device can be used as-is, without any
devlink changes; that works for me.
It does not remove the need to invent the name at some point ;)
We resolved this in a similar manner, and that's fine, given my
understanding that you cannot let faux dispatch for you, e.g.:
faux_get_instance(serial_number_equivalent)
>> +
>> +/* This structure represents a shared devlink instance,
>> + * there is one created for PF group of the same chip.
>> + */
>> +struct mlx5_shd {
>> + /* Node in shd list */
>> + struct list_head list;
>> + /* Serial number of the chip */
>> + const char *sn;
>> + /* List of per-PF dev instances. */
>> + struct list_head dev_list;
>> + /* Related faux device */
>> + struct faux_device *faux_dev;
>> +};
>> +
>
> For ice, the equivalent of this would essentially replace ice_adapter I imagine.
or "ice_adapter will be the ice equivalent"
>
>> +static const struct devlink_ops mlx5_shd_ops = {
please double-check that this does not crash:
$ devlink dev info the/faux/thing
>> +};
>> +
>> +static int mlx5_shd_faux_probe(struct faux_device *faux_dev)
>> +{
>> + struct devlink *devlink;
>> + struct mlx5_shd *shd;
>> +
>> + devlink = devlink_alloc(&mlx5_shd_ops, sizeof(struct mlx5_shd),
sizeof(*shd)
I like that you reuse devlink_alloc() with its allocation of priv data;
that suits our needs as well
>> + &faux_dev->dev);
>> + if (!devlink)
>> + return -ENOMEM;
>> + shd = devlink_priv(devlink);
>> + faux_device_set_drvdata(faux_dev, shd);
>> +
>> + devl_lock(devlink);
>> + devl_register(devlink);
>> + devl_unlock(devlink);
>> + return 0;
>> +}
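The two remarks above (size the private area with sizeof(*shd), and let the allocator hand back driver-private storage) can be illustrated with a small userspace sketch. This is not the devlink implementation: demo_core, demo_alloc(), demo_priv() and demo_free() are hypothetical stand-ins for struct devlink, devlink_alloc(), devlink_priv() and devlink_free(), and demo_shd is a cut-down stand-in for struct mlx5_shd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct devlink: header plus private storage. */
struct demo_core {
	size_t priv_size;
	/* Driver-private area, aligned for any object type. */
	_Alignas(max_align_t) unsigned char priv[];
};

/* Hypothetical, cut-down stand-in for struct mlx5_shd. */
struct demo_shd {
	const char *sn;
	int nr_devs;
};

/* Allocate the core object followed by priv_size zeroed bytes. */
static struct demo_core *demo_alloc(size_t priv_size)
{
	struct demo_core *core = calloc(1, sizeof(*core) + priv_size);

	if (!core)
		return NULL;
	core->priv_size = priv_size;
	return core;
}

/* Return the driver-private area that trails the core object. */
static void *demo_priv(struct demo_core *core)
{
	assert(core);
	return core->priv;
}

static void demo_free(struct demo_core *core)
{
	free(core);
}
```

A caller would write demo_alloc(sizeof(*shd)) rather than demo_alloc(sizeof(struct demo_shd)), so the size keeps following the pointer's type if that type is ever changed.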
[...]
>> +int mlx5_shd_init(struct mlx5_core_dev *dev)
>> +{
>> + u8 *vpd_data __free(kfree) = NULL;
too bad that netdev maintainers discourage __free() :(
perhaps I should propose a higher-abstraction wrapper for it
on April 1st
>> + struct pci_dev *pdev = dev->pdev;
>> + unsigned int vpd_size, kw_len;
>> + struct mlx5_shd *shd;
>> + const char *sn;
I would extract name retrieval, perhaps mlx5_shd_get_name()?
>> + char *end;
>> + int start;
>> + int err;
>> +
>> + if (!mlx5_core_is_pf(dev))
>> + return 0;
>> +
>> + vpd_data = pci_vpd_alloc(pdev, &vpd_size);
>> + if (IS_ERR(vpd_data)) {
>> + err = PTR_ERR(vpd_data);
>> + return err == -ENODEV ? 0 : err;
what? does that mean the driver will work properly without the
shared devlink instance?
>> + }
>> + start = pci_vpd_find_ro_info_keyword(vpd_data, vpd_size, "V3",
>> + &kw_len);
>> + if (start < 0) {
>> + /* Fall-back to SN for older devices. */
>> + start = pci_vpd_find_ro_info_keyword(vpd_data, vpd_size,
>> + PCI_VPD_RO_KEYWORD_SERIALNO, &kw_len);
>> + if (start < 0)
>> + return -ENOENT;
>> + }
>> + sn = kstrndup(vpd_data + start, kw_len, GFP_KERNEL);
>> + if (!sn)
>> + return -ENOMEM;
>> + end = strchrnul(sn, ' ');
>> + *end = '\0';
>> +
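The extraction suggested above (something like mlx5_shd_get_name()) could be sketched in plain userspace C as below. It only covers the last step of the quoted code: copying a keyword value of kw_len bytes and cutting it at the first space. The VPD lookup itself is left out. The name demo_shd_get_name() is hypothetical, malloc()/memcpy() stand in for kstrndup(), and strchr() with a NULL check stands in for strchrnul().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy a keyword value of kw_len bytes (not
 * NUL-terminated) and cut it at the first space, as the quoted code does.
 * Returns a malloc()ed string the caller must free(), or NULL on failure.
 */
static char *demo_shd_get_name(const unsigned char *field, size_t kw_len)
{
	char *name, *end;

	assert(field);
	name = malloc(kw_len + 1);
	if (!name)
		return NULL;
	memcpy(name, field, kw_len);
	name[kw_len] = '\0';

	/* Drop space padding and anything that follows it. */
	end = strchr(name, ' ');
	if (end)
		*end = '\0';
	return name;
}
```

With a helper of this shape, the caller is left with the lookup under the mutex and a single place where the name has to be freed.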
>> + guard(mutex)(&shd_mutex);
guard()() is a no-no too, per "discouraged by netdev maintainers",
and here I'm on board with discouraging ;)
>> + list_for_each_entry(shd, &shd_list, list) {
>> + if (!strcmp(shd->sn, sn)) {
>> + kfree(sn);
>> + goto found;
>> + }
>> + }
>> + shd = mlx5_shd_create(sn);
>> + if (!shd) {
>> + kfree(sn);
>> + return -ENOMEM;
>> + }
>
> How is the faux device kept in memory? I guess its reference counted somewhere?
get_device()/put_device() with faux_dev->dev as argument
But I don't see that reference being incremented in the list_for_each.
Jiri keeps "the counter" as the implicit observation of shd list size :)
which is protected by mutex
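
To make the "implicit counter" idea concrete, here is a minimal userspace sketch of a find-or-create lookup keyed by serial number, where the shared object lives exactly as long as something is attached to it. It assumes, as the reply above reads, that lifetime follows list membership under one mutex; it is not the code from the patch. All demo_* names are hypothetical, a pthread mutex stands in for shd_mutex, and the nr_devs counter stands in for the length of dev_list. It also uses explicit lock/unlock with goto labels rather than guard().

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

struct demo_shd {
	struct demo_shd *next;	/* link in demo_shd_list */
	char *sn;		/* serial number shared by the group */
	unsigned int nr_devs;	/* attached PFs; stands in for dev_list length */
};

static struct demo_shd *demo_shd_list;
static pthread_mutex_t demo_shd_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Find or create the shared instance for sn and attach one PF to it. */
static struct demo_shd *demo_shd_get(const char *sn)
{
	struct demo_shd *shd;
	size_t len = strlen(sn) + 1;

	pthread_mutex_lock(&demo_shd_mutex);
	for (shd = demo_shd_list; shd; shd = shd->next)
		if (!strcmp(shd->sn, sn))
			goto found;

	shd = calloc(1, sizeof(*shd));
	if (!shd)
		goto unlock;
	shd->sn = malloc(len);
	if (!shd->sn) {
		free(shd);
		shd = NULL;
		goto unlock;
	}
	memcpy(shd->sn, sn, len);
	shd->next = demo_shd_list;
	demo_shd_list = shd;
found:
	shd->nr_devs++;
unlock:
	pthread_mutex_unlock(&demo_shd_mutex);
	return shd;
}

/* Detach one PF; the last one out unlinks and frees the shared instance. */
static void demo_shd_put(struct demo_shd *shd)
{
	struct demo_shd **pp;

	pthread_mutex_lock(&demo_shd_mutex);
	assert(shd->nr_devs > 0);
	if (--shd->nr_devs == 0) {
		for (pp = &demo_shd_list; *pp; pp = &(*pp)->next) {
			if (*pp == shd) {
				*pp = shd->next;
				break;
			}
		}
		free(shd->sn);
		free(shd);
	}
	pthread_mutex_unlock(&demo_shd_mutex);
}
```

No separate reference count is taken during the lookup here: the attach count changes only under the mutex, so an entry found in the list cannot be freed while the lock is held.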