lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ybrtz77i3hbxdwau4k55xn5brsnrtyomg6u65eyqm4fh7nsnob@arqyloer2l5z>
Date: Mon, 24 Feb 2025 17:14:24 +0100
From: Jiri Pirko <jiri@...nulli.us>
To: Przemek Kitszel <przemyslaw.kitszel@...el.com>
Cc: intel-wired-lan@...ts.osuosl.org, 
	Tony Nguyen <anthony.l.nguyen@...el.com>, Jakub Kicinski <kuba@...nel.org>, 
	Cosmin Ratiu <cratiu@...dia.com>, Tariq Toukan <tariqt@...dia.com>, netdev@...r.kernel.org, 
	Konrad Knitter <konrad.knitter@...el.com>, Jacob Keller <jacob.e.keller@...el.com>, davem@...emloft.net, 
	Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew@...n.ch>, 
	linux-kernel@...r.kernel.org, ITP Upstream <nxne.cnse.osdt.itp.upstreaming@...el.com>, 
	Carolina Jubran <cjubran@...dia.com>
Subject: Re: [RFC net-next v2 1/2] devlink: add whole device devlink instance

Wed, Feb 19, 2025 at 05:32:54PM +0100, przemyslaw.kitszel@...el.com wrote:
>Add a support for whole device devlink instance. Intented as a entity
>over all PF devices on given physical device.
>
>In case of ice driver we have multiple PF devices (with their devlink
>dev representation), that have separate drivers loaded. However those
>still do share lots of resources due to being the on same HW. Examples
>include PTP clock and RSS LUT. Historically such stuff was assigned to
>PF0, but that was both not clear and not working well. Now such stuff
>is moved to be covered into struct ice_adapter, there is just one instance
>of such per HW.
>
>This patch adds a devlink instance that corresponds to that ice_adapter,
>to allow arbitrage over resources (as RSS LUT) via it (further in the
>series (RFC NOTE: stripped out so far)).
>
>Thanks to Wojciech Drewek for very nice naming of the devlink instance:
>PF0:		pci/0000:00:18.0
>whole-dev:	pci/0000:00:18
>But I made this a param for now (driver is free to pass just "whole-dev").
>
>$ devlink dev # (Interesting part of output only)
>pci/0000:af:00:
>  nested_devlink:
>    pci/0000:af:00.0
>    pci/0000:af:00.1
>    pci/0000:af:00.2
>    pci/0000:af:00.3
>    pci/0000:af:00.4
>    pci/0000:af:00.5
>    pci/0000:af:00.6
>    pci/0000:af:00.7


In general, I like this approach. In fact, I have quite similar
patch/set in my sandbox git.

The problem I didn't figure out how to handle, was a backing entity
for the parent devlink.

You use part of PCI BDF, which is obviously wrong:
1) bus_name/dev_name the user expects to be the backing device bus and
   address on it (pci/usb/i2c). With using part of BDF, you break this
   assumption.
2) 2 PFs can have totally different BDF (in VM for example). Then your
   approach is broken.

I was thinking about having an auxiliary device created for the parent,
but auxiliary assumes it is child. The is upside-down.

I was thinking about having some sort of made-up per-driver bus, like
"ice" of "mlx5" with some thing like DSN that would act as a "dev_name".
I have a patch that introduces:

struct devlink_shared_inst;

struct devlink *devlink_shared_alloc(const struct devlink_ops *ops,
                                     size_t priv_size, struct net *net,
                                     struct module *module, u64 per_module_id,
                                     void *inst_priv,
                                     struct devlink_shared_inst **p_inst);
void devlink_shared_free(struct devlink *devlink,
                        struct devlink_shared_inst *inst);

I took a stab at it here:
https://github.com/jpirko/linux_mlxsw/commits/wip_dl_pfs_parent/
The work is not finished.


Also, I was thinking about having some made-up bus, like "pci_ids",
where instead of BDFs as addresses, there would be DSN for example.

None of these 3 is nice. 

The shared parent entity for PFs (and other Fs) is always reference
counted, first creates, last removes. I feel like this is something
missing in PCI spec. If such beast would exist, very easy to implement
this in devlink. We have all we need in place already.



Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ