Message-ID: <CAFtQo5BxQR56e5PNFQoRXNHOfssPZNdTDMEFpHFVS07FPpKCKg@mail.gmail.com>
Date: Sun, 28 Apr 2024 16:24:14 -0400
From: Shane Miller <gshanemiller6@...il.com>
To: Jiri Pirko <jiri@...nulli.us>
Cc: netdev@...r.kernel.org
Subject: Re: SR-IOV + switchdev + vlan + Mellanox: Cannot ping
J Pirko wrote,
"You have to configure forwarding between appropriate representors. Use
ovs (probably easiest) or tc."
Thank you for taking the time to reply, but I need additional guidance on
what to bridge and how to bridge it.
TC can be used to mirror packets, for example, and in fact I have set that up,
which is why I need the NIC in switchdev mode. However, that is orthogonal.
As I said in the original post, leaving the NIC in "legacy" mode causes no ping
issues. As far as I understand it, TC is not part of the solution space here.
My vague understanding is that putting a NIC into switchdev mode means packets
flow through HW only, bypassing the kernel, and that this breaks ARP because
the kernel is still needed a bit. A bridge is supposed to fix that. I tried:
brctl addbr sriovbr
brctl addif sriovbr <DEV>
ip link set dev sriovbr up
ip addr ... sriov ...
where <DEV> was the link name of the physical device, the virtual link, the
port representor, or a combination of these, all to no effect.
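For reference, the same attempt can be written with iproute2 instead of brctl. The sketch below is a dry run: it only prints the commands and changes nothing. The function name bridge_cmds and its three arguments are hypothetical, and the address in the usage line is a documentation placeholder, not a value from this setup. It reproduces what was tried above and is not a claimed fix.

```shell
#!/bin/sh
# Dry-run sketch: prints iproute2 equivalents of the brctl attempt.
# bridge_cmds is a hypothetical helper; BR, DEV and ADDR are supplied
# by the caller and are not taken from the original post.
bridge_cmds() {
    br=$1; dev=$2; addr=$3
    echo "ip link add name $br type bridge"   # brctl addbr
    echo "ip link set dev $dev master $br"    # brctl addif
    echo "ip link set dev $br up"
    echo "ip addr add $addr dev $br"
}

# Example with placeholder values (192.0.2.0/24 is a documentation range).
bridge_cmds sriovbr ieth3r0 192.0.2.10/24
```

Piping the output to sh as root would apply the commands; printing first makes it easy to review which device is being enslaved.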
So, restating the issue: a NIC is SR-IOV virtualized into 4 virtual NICs, each
with a VLAN and an IP address. The NIC is placed into switchdev mode. The
virtual NICs are not pingable from other boxes. The other boxes see the virtual
NICs' MAC addresses as incomplete (arp -n or arp -e).
What and how do I bridge/link to fix this problem?
On Sat, Apr 27, 2024 at 6:26 AM Jiri Pirko <jiri@...nulli.us> wrote:
>
> Fri, Apr 26, 2024 at 10:35:28PM CEST, gshanemiller6@...il.com wrote:
> >Problem:
> >-----------------------------------------------------------------
> >root@...hA $ ping 10.xx.xx.194
> >PING 10.xx.xx.194 (10.xx.xx.194) 56(84) bytes of data
> >From 10.xx.xx.191 icmp seq=10 Destination Host Unreachable
> >Proximate Cause:
> >-----------------------------------------------------------------
> >This seems to be a side effect of "switchdev" mode. When the identical
> >configuration is set up EXCEPT that the SR-IOV virtualized NIC is left
> >"legacy", ping (and ncat) works just fine.
> >
> >As far as I can tell I need a bridge or bridge commands, but I have no
> >idea where to start. This environment will not allow me to add modify
> >commands when enabling switchdev mode. devlink seems to accept
> >"switchdev" alone without modifiers.
>
> You have to configure forwarding between appropriate representors. Use
> ovs (probably easiest) or tc.
>
> >
> >Note: putting a NIC into switchdev mode makes the virtual functions
> >show as "link-state disable" which is confusing. (See below.) Contrary
> >to what it seems to suggest, the virtual NICs are up and running
> >
> >Running "arp -e" on machine A shows machine B's ieth3v0 MAC address as
> >incomplete suggesting switchdev+ARP is broken.
> >
> >Problem Environment:
> >-----------------------------------------------------------------
> >OS: RHEL 8.6 4.18.0-372.46.1.el8 x64
> >NICs: Mellanox ConnectX-6
> >
> >Machine A Links:
> >70 tst@...h3: <...LOWER_UP...> mtu 1500
> > link/ether xx.xx.xx.xx.xx.xx
> > vlan protocol 802.1Q id 133 <REORDER_HDR>
> > Inet 10.xx.xx.191
> >
> >Machine B Links With ieth3 in SR-IOV mode in switchdev mode:
> ># Physical Function and its virtual functions:
> > 2: ieth3: <...PROMISC,UP,LOWER_UP> mtu 1500
> > link/ether xx.xx.xx.xx.xx.f6 portname p0 switchid xxxxe988
> > vf 0 link/ether xx.xx.xx.xx.xx.00 vlan 133 spoof off, link-state disable, trust off
> > . . .
> ># Port representers
> >893: ieth3r0: <...UP,LOWER_UP> mtu 1500
> > link/ether xx.xx.xx.xx.xx.e1 portname pf0vf0 switchid xxxxe988
> >. . .
> ># Virtual Links
> >897: ieth3v0: <...UP,LOWER_UP> mtu 1500
> > link/ether xx.xx.xx.xx.xx.00 promiscuity 0
> > inet 10.xx.xx.194/24 scope global ieth3v0
> > . . .
> >
> >SR-IOV Setup Summary
> >-----------------------------------------------------------------
> >This is done right since, in legacy mode, ping/ncat works fine:
> >
> >1. Enable IOMMU, Vtx in BIOS
> >2. Boot Linux with iommu=on on command line
> >3. Install Mellanox OFED
> >4. Enable SR-IOV for max 8 devices in Mellanox firmware
> >(reboot)
> >5. Create 4 virtual NICs w/ SR-IOV
> >6. Configure 4 virtual NICs mac, trust off, spoofchk off, state auto
> >7. Unbind virtual NICs
> >8. Put ieth3 into switchdev mode
> >9. Rebind virtual NICs
> >10. Bring all links up
> >11. Assign IPV4 addresses to virtual links
> >
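For reference, steps 5 through 10 of the quoted setup summary can be sketched as a dry run for a single VF. The script only prints commands. The function name switchdev_cmds, the PCI addresses and the MAC in the usage line are hypothetical placeholders; the sysfs paths and the mlx5_core driver name are assumptions based on the usual mlx5 layout, not details from the original post. Step 11 is omitted because the addresses are elided in the post.

```shell
#!/bin/sh
# Dry-run sketch of setup steps 5-10 for VF 0: prints commands only.
# switchdev_cmds is a hypothetical helper. PF name, PF/VF PCI addresses
# and the VF MAC are caller-supplied placeholders.
switchdev_cmds() {
    pf=$1; pf_pci=$2; vf_pci=$3; vf_mac=$4
    # Step 5: create 4 VFs on the physical function.
    echo "echo 4 > /sys/class/net/$pf/device/sriov_numvfs"
    # Step 6: per-VF settings (VLAN 133 as shown for vf 0 in the post).
    echo "ip link set dev $pf vf 0 mac $vf_mac vlan 133 trust off spoofchk off state auto"
    # Step 7: unbind the VF from its driver (mlx5_core assumed).
    echo "echo $vf_pci > /sys/bus/pci/drivers/mlx5_core/unbind"
    # Step 8: switch the eswitch to switchdev mode.
    echo "devlink dev eswitch set pci/$pf_pci mode switchdev"
    # Step 9: rebind the VF.
    echo "echo $vf_pci > /sys/bus/pci/drivers/mlx5_core/bind"
    # Step 10: bring the PF up (representors and VF links need the same).
    echo "ip link set dev $pf up"
}

# Example with made-up PCI addresses and a locally administered MAC.
switchdev_cmds ieth3 0000:03:00.0 0000:03:00.2 02:00:00:00:00:01
```

Steps 6, 7 and 9 would be repeated for each of the 4 VFs.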