[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <emj5f7xfdcnkemdairmpyiqmq5aoj2uqr7bxhfiezqm4zeuchu@wuplknrtviud>
Date: Wed, 19 Mar 2025 12:56:04 +0100
From: Jiri Pirko <jiri@...nulli.us>
To: Przemek Kitszel <przemyslaw.kitszel@...el.com>
Cc: "Keller, Jacob E" <jacob.e.keller@...el.com>,
"gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>, "davem@...emloft.net" <davem@...emloft.net>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "Dumazet, Eric" <edumazet@...gle.com>,
"kuba@...nel.org" <kuba@...nel.org>, "pabeni@...hat.com" <pabeni@...hat.com>,
"saeedm@...dia.com" <saeedm@...dia.com>, "leon@...nel.org" <leon@...nel.org>,
"tariqt@...dia.com" <tariqt@...dia.com>, "andrew+netdev@...n.ch" <andrew+netdev@...n.ch>,
"dakr@...nel.org" <dakr@...nel.org>, "rafael@...nel.org" <rafael@...nel.org>,
"Nguyen, Anthony L" <anthony.l.nguyen@...el.com>, "cratiu@...dia.com" <cratiu@...dia.com>,
"Knitter, Konrad" <konrad.knitter@...el.com>, "cjubran@...dia.com" <cjubran@...dia.com>
Subject: Re: [PATCH net-next RFC 2/3] net/mlx5: Introduce shared devlink
instance for PFs on same chip
Wed, Mar 19, 2025 at 09:21:52AM +0100, przemyslaw.kitszel@...el.com wrote:
>On 3/18/25 23:05, Keller, Jacob E wrote:
>>
>>
>> > -----Original Message-----
>> > From: Jiri Pirko <jiri@...nulli.us>
>
>> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/sh_devlink.c
>> > @@ -0,0 +1,150 @@
>> > +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
>> > +/* Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
>> > +
>> > +#include <linux/device/faux.h>
>> > +#include <linux/mlx5/driver.h>
>> > +#include <linux/mlx5/vport.h>
>> > +
>> > +#include "sh_devlink.h"
>> > +
>> > +static LIST_HEAD(shd_list);
>> > +static DEFINE_MUTEX(shd_mutex); /* Protects shd_list and shd->list */
>
>I essentially agree that faux_device could be used as-is, without any
>devlink changes, works for me.
>That does not remove the need to invent the name at some point ;)
What name? Name of faux device? Sure. In case of ice it's probably PCI DSN
>
>we have resolved this in similar manner, that's fine, given my
>understanding that you cannot let faux to dispatch for you, like:
>faux_get_instance(serial_number_equivalent)
Not sure what you are asking TBH.
>
>> > +
>> > +/* This structure represents a shared devlink instance,
>> > + * there is one created for PF group of the same chip.
>> > + */
>> > +struct mlx5_shd {
>> > + /* Node in shd list */
>> > + struct list_head list;
>> > + /* Serial number of the chip */
>> > + const char *sn;
>> > + /* List of per-PF dev instances. */
>> > + struct list_head dev_list;
>> > + /* Related faux device */
>> > + struct faux_device *faux_dev;
>> > +};
>> > +
>>
>> For ice, the equivalent of this would essentially replace ice_adapter I imagine.
>
>or "ice_adapter will be the ice equivalent"
Oh yes, that makes sense.
>
>>
>> > +static const struct devlink_ops mlx5_shd_ops = {
>
>please double check if there is no crash for:
>$ devlink dev info the/faux/thing
Will do, but why do you think so?
>
>> > +};
>> > +
>> > +static int mlx5_shd_faux_probe(struct faux_device *faux_dev)
>> > +{
>> > + struct devlink *devlink;
>> > + struct mlx5_shd *shd;
>> > +
>> > + devlink = devlink_alloc(&mlx5_shd_ops, sizeof(struct mlx5_shd),
>
>sizeof(*shd)
Sure.
>
>I like that you reuse devlink_alloc(), with allocation of priv data,
>that suits also our needs
>
>> > &faux_dev->dev);
>> > + if (!devlink)
>> > + return -ENOMEM;
>> > + shd = devlink_priv(devlink);
>> > + faux_device_set_drvdata(faux_dev, shd);
>> > +
>> > + devl_lock(devlink);
>> > + devl_register(devlink);
>> > + devl_unlock(devlink);
>> > + return 0;
>> > +}
>
>[...]
>
>> > +int mlx5_shd_init(struct mlx5_core_dev *dev)
>> > +{
>> > + u8 *vpd_data __free(kfree) = NULL;
>
>so bad that netdev mainainers discourage __free() :(
>perhaps I should propose higher abstraction wrapper for it
>on April 1st
>
>> > + struct pci_dev *pdev = dev->pdev;
>> > + unsigned int vpd_size, kw_len;
>> > + struct mlx5_shd *shd;
>> > + const char *sn;
>
>I would extract name retrieval, perhaps mlx5_shd_get_name()?
I had that, did not really make the code nicer :)
>
>> > + char *end;
>> > + int start;
>> > + int err;
>> > +
>> > + if (!mlx5_core_is_pf(dev))
>> > + return 0;
>> > +
>> > + vpd_data = pci_vpd_alloc(pdev, &vpd_size);
>> > + if (IS_ERR(vpd_data)) {
>> > + err = PTR_ERR(vpd_data);
>> > + return err == -ENODEV ? 0 : err;
>
>what? that means the shared devlink instance is something you will
>work properly without?
Not sure. This is something that should not happen for any existing
device.
>
>> > + }
>> > + start = pci_vpd_find_ro_info_keyword(vpd_data, vpd_size, "V3",
>> > &kw_len);
>> > + if (start < 0) {
>> > + /* Fall-back to SN for older devices. */
>> > + start = pci_vpd_find_ro_info_keyword(vpd_data, vpd_size,
>> > +
>> > PCI_VPD_RO_KEYWORD_SERIALNO, &kw_len);
>> > + if (start < 0)
>> > + return -ENOENT;
>> > + }
>> > + sn = kstrndup(vpd_data + start, kw_len, GFP_KERNEL);
>> > + if (!sn)
>> > + return -ENOMEM;
>> > + end = strchrnul(sn, ' ');
>> > + *end = '\0';
>> > +
>> > + guard(mutex)(&shd_mutex);
>
>guard()() is a no-no too, per "discouraged by netdev maintainers",
>and here I'm on board with discouraging ;)
That's a fight with windmills. It will happen sooner than later anyway.
It is just too conventient. I don't understand why netdev has to have
special treat comparing to the rest of the kernel all the time...
>
>> > + list_for_each_entry(shd, &shd_list, list) {
>> > + if (!strcmp(shd->sn, sn)) {
>> > + kfree(sn);
>> > + goto found;
>> > + }
>> > + }
>> > + shd = mlx5_shd_create(sn);
>> > + if (!shd) {
>> > + kfree(sn);
>> > + return -ENOMEM;
>> > + }
>>
>> How is the faux device kept in memory? I guess its reference counted
>> somewhere?
>
>get_device()/put_device() with faxu_dev->dev as argument
>
>But I don't see that reference being incremented in the list_for_each.
>
>Jiri keeps "the counter" as the implicit observation of shd list size :)
>which is protected by mutex
Yep. Why isn't that enough? I need the list anyway. Plus, I'm using the
list to reference count shd, not the fauxdev. Fauxdev is explicitly
create/destroyed by calling appropriate function. I belive that is the
correct way (maybe the only way?) to instantiate/deinstantiate faux.
Powered by blists - more mailing lists