netdev - Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port function

Message-ID: <bee1e240-cc6a-4c30-a2ae-6f7974627053@nvidia.com>
Date: Thu, 8 May 2025 12:04:22 +0300
From: Mark Bloch <mbloch@...dia.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Moshe Shemesh <moshe@...dia.com>, netdev@...r.kernel.org,
 "David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
 Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
 Donald Hunter <donald.hunter@...il.com>, Jiri Pirko <jiri@...nulli.us>,
 Jonathan Corbet <corbet@....net>, Andrew Lunn <andrew+netdev@...n.ch>,
 Tariq Toukan <tariqt@...dia.com>
Subject: Re: [RFC net-next 0/5] devlink: Add unique identifier to devlink port
 function

On 08/05/2025 3:43, Jakub Kicinski wrote:
> On Tue, 6 May 2025 18:34:22 +0300 Mark Bloch wrote:
>>>> Flow:
>>>> 1. A user requests a container with networking connectivity.
>>>> 2. Kubernetes allocates a VF on host X. An agent on the host handles VF
>>>>    configuration and sends the PF number and VF index to the central
>>>>    management software.  
>>>
>>> What is "central management software" here? Deployment specific or
>>> some part of k8s?  
>>
>> It's the k8s API server.
>>
>>>   
>>>> 3. An agent on the DPU side detects the changes made on host X. Using
>>>>    the PF number and VF index, it identifies the corresponding
>>>>    representor, attaches it to an OVS bridge, and allows OVN to program
>>>>    the relevant steering rules.  
>>>
>>> What does it mean that DPU "detects it", what's the source and 
>>> mechanism of the notification?
>>> Is it communicating with the central SW during  the process?  
>>
>> The agent (running in the ARM/DPU) listens for events from the k8s API server.
> 
> Interesting. So a deployment with no security boundaries. The internals
> of the IPU and the k8s on the host are in the same domain of control.

The VF is created on host X, but the corresponding representor appears
on a different host, the IPU. Naturally, they need to be able to
synchronize and exchange information for everything to work correctly.

> 
> So how does the user remotely power cycle the hosts?

Why should a user be able to power cycle the hosts?
Are you asking about the administrator?

> 
> What I'm getting at is that your mental model seems to be missing any
> sort of HW inventory database, which lists all the hosts and how they
> plug into the DC. The administrator of the system must already know
> where each machine is exactly in the chassis for basic DC ops. And
> that HW DB is normally queried in what you describe. If there is any
> security domain crossing in the picture it will require cross checking
> against that HW DB.

You're assuming that external host numbering and PCI enumeration are
stable. Also, users can determine the mapping only after creating
VFs, and even then the mapping is indirect, e.g.: “I created a VF on
this PF, and I see a single representor appear on the IPU, so they
must be linked.” That approach is fragile and error prone.

Also, keep in mind: the external hosts and their kernels shouldn’t
be aware they’re part of a multi-host system. With our current
approach, you just need to provide a host-to-IPU mapping
upfront, no guesswork involved.

Just thinking out loud: once this feature is in place, we might
not even need a static mapping between external hosts and IPU hosts.

If VUID and FUID are globally unique, the following workflow
becomes possible:

- A user requests a container with network connectivity.
- k8s allocates and configures a VF on one of the hosts.
  It then sends the VUID, PF number, and VF index for the new VF
  to the k8s API server.
- Somewhere in the network, a representor appears. An agent detects
  this and notifies the k8s API server, including its FUID,
  PF number, and VF index.
- The API server matches the VF and representor data based on the
  globally unique identifiers and sends the relevant information
  back to the agent that reported the representor creation.
- The agent attaches the representor to the OVS bridge, and with
  OVN configures the appropriate steering rules.

This would remove the need for predefined host-to-IPU mappings
and allow for a more dynamic and flexible setup.
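
To make the matching step concrete, here is a rough sketch of what the
API server side could do. It is only an illustration: the class and
field names (VfReport, RepReport, Matcher) are made up, and it assumes
the VUID seen on the host and the FUID seen on the representor carry
the same value for a given function, which is the premise of the
workflow above rather than something defined here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class VfReport:
    """Hypothetical record sent from the host after a VF is configured."""
    uid: str    # VUID as seen on the host (assumed equal to the FUID)
    pf: int     # PF number
    vf: int     # VF index
    host: str   # reporting host, illustrative only


@dataclass(frozen=True)
class RepReport:
    """Hypothetical record sent by the agent that saw the representor."""
    uid: str    # FUID as seen on the representor side
    pf: int
    vf: int
    agent: str  # reporting agent, illustrative only


class Matcher:
    """Pairs VF and representor reports by unique identifier.

    Reports may arrive in either order, so each side is stored until
    its counterpart shows up. No host-to-IPU table is consulted.
    """

    def __init__(self) -> None:
        self._vfs: Dict[str, VfReport] = {}
        self._reps: Dict[str, RepReport] = {}

    def report_vf(self, r: VfReport) -> Optional[Tuple[VfReport, RepReport]]:
        self._vfs[r.uid] = r
        return self._pair(r.uid)

    def report_representor(
        self, r: RepReport
    ) -> Optional[Tuple[VfReport, RepReport]]:
        self._reps[r.uid] = r
        return self._pair(r.uid)

    def _pair(self, uid: str) -> Optional[Tuple[VfReport, RepReport]]:
        vf = self._vfs.get(uid)
        rep = self._reps.get(uid)
        if vf is None or rep is None:
            return None  # still waiting for the other side
        return vf, rep


m = Matcher()
m.report_vf(VfReport("uid-1", 0, 3, "host-x"))
pair = m.report_representor(RepReport("uid-1", 0, 3, "ipu-7"))
```

Once a pair is returned, the server knows which agent to send the VF
details back to, and that agent can go on to attach the representor to
the OVS bridge.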

> 
> I don't think this is sufficiently well established to warrant new uAPI.
> You can use a UUID and pass it via ndo_get_phys_port_id.

phys_port_id only applies to netdev interfaces, whereas this use case is
broader and more aligned with devlink. We believe devlink is a more
appropriate place for this functionality.

Mark