[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <zzyls3te4he2l5spf4wzfb53imuoemopwl774dzq5t5s22sg7l@37fk7fvgvnrr>
Date: Tue, 25 Feb 2025 15:35:09 +0100
From: Jiri Pirko <jiri@...nulli.us>
To: Przemek Kitszel <przemyslaw.kitszel@...el.com>
Cc: intel-wired-lan@...ts.osuosl.org,
Tony Nguyen <anthony.l.nguyen@...el.com>, Jakub Kicinski <kuba@...nel.org>,
Cosmin Ratiu <cratiu@...dia.com>, Tariq Toukan <tariqt@...dia.com>, netdev@...r.kernel.org,
Konrad Knitter <konrad.knitter@...el.com>, Jacob Keller <jacob.e.keller@...el.com>, davem@...emloft.net,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew@...n.ch>,
linux-kernel@...r.kernel.org, ITP Upstream <nxne.cnse.osdt.itp.upstreaming@...el.com>,
Carolina Jubran <cjubran@...dia.com>
Subject: Re: [RFC net-next v2 1/2] devlink: add whole device devlink instance
Tue, Feb 25, 2025 at 12:30:49PM +0100, przemyslaw.kitszel@...el.com wrote:
>
>> > Thanks to Wojciech Drewek for very nice naming of the devlink instance:
>> > PF0: pci/0000:00:18.0
>> > whole-dev: pci/0000:00:18
>> > But I made this a param for now (driver is free to pass just "whole-dev").
>> >
>> > $ devlink dev # (Interesting part of output only)
>> > pci/0000:af:00:
>> > nested_devlink:
>> > pci/0000:af:00.0
>> > pci/0000:af:00.1
>> > pci/0000:af:00.2
>> > pci/0000:af:00.3
>> > pci/0000:af:00.4
>> > pci/0000:af:00.5
>> > pci/0000:af:00.6
>> > pci/0000:af:00.7
>>
>>
>> In general, I like this approach. In fact, I have quite similar
>> patch/set in my sandbox git.
>>
>> The problem I didn't figure out how to handle, was a backing entity
>> for the parent devlink.
>>
>> You use part of PCI BDF, which is obviously wrong:
>> 1) bus_name/dev_name the user expects to be the backing device bus and
>> address on it (pci/usb/i2c). With using part of BDF, you break this
>> assumption.
>> 2) 2 PFs can have totally different BDF (in VM for example). Then your
>> approach is broken.
>
>To make the hard part of it easy, I like to have the name to be provided
>by what the PF/driver has available (whichever will be the first of
>given device PFs), as of now, we resolve this issue (and provide ~what
>your devlink_shared does) via ice_adapter.
I don't understand. Can you provide some examples please?
>
>Making it a devlink instance gives user an easy way to see the whole
>picture of all resources handled as "shared per device", my current
>output, for all PFs and VFs on given device:
>
>pci/0000:af:00:
> name rss size 8 unit entry size_min 0 size_max 24 size_gran 1
> resources:
> name lut_512 size 0 unit entry size_min 0 size_max 16 size_gran 1
> name lut_2048 size 8 unit entry size_min 0 size_max 8 size_gran 1
>
>What is contributing to the hardness, this is not just one for all ice
>PFs, but one per device, which we distinguish via pci BDF.
How?
>
>>
>> I was thinking about having an auxiliary device created for the parent,
>> but auxiliary assumes it is child. The is upside-down.
>>
>> I was thinking about having some sort of made-up per-driver bus, like
>> "ice" of "mlx5" with some thing like DSN that would act as a "dev_name".
>> I have a patch that introduces:
>>
>> struct devlink_shared_inst;
>>
>> struct devlink *devlink_shared_alloc(const struct devlink_ops *ops,
>> size_t priv_size, struct net *net,
>> struct module *module, u64 per_module_id,
>> void *inst_priv,
>> struct devlink_shared_inst **p_inst);
>> void devlink_shared_free(struct devlink *devlink,
>> struct devlink_shared_inst *inst);
>>
>> I took a stab at it here:
>> https://github.com/jpirko/linux_mlxsw/commits/wip_dl_pfs_parent/
>> The work is not finished.
>>
>>
>> Also, I was thinking about having some made-up bus, like "pci_ids",
>> where instead of BDFs as addresses, there would be DSN for example.
>>
>> None of these 3 is nice.
>
>how one would invent/infer/allocate the DSN?
Driver knows DSN, it can obtain from pci layer.
>
>faux_bus mentioned by Jake would be about the same level of "fakeness"
>as simply allocating a new instance of devlink by the first PF, IMO :)
Hmm, briefly looking at faux, this looks like fills the gap I missed in
auxdev. Will try to use it in my patchset.
Thanks!
Powered by blists - more mailing lists