Message-ID: <Zi-Epjj3eiznjEyQ@nanopsycho>
Date: Mon, 29 Apr 2024 13:29:42 +0200
From: Jiri Pirko <jiri@...nulli.us>
To: Shane Miller <gshanemiller6@...il.com>
Cc: netdev@...r.kernel.org
Subject: Re: SR-IOV + switchdev + vlan + Mellanox: Cannot ping
Sun, Apr 28, 2024 at 10:24:14PM CEST, gshanemiller6@...il.com wrote:
>J Pirko wrote,
>
>"You have to configure forwarding between appropriate representors. Use
>ovs (probably easiest) or tc."
>
>Thank you for taking time to reply. But I need additional information/guidance
> on how to bridge and what to bridge.
>
>TC can be used to mirror packets for example and in fact, I have set that up,
>which is why I need the NIC in switchdev mode. However, this is orthogonal.
>As I say in the original post, leaving the NIC in "legacy" mode has no ping
>issues. As far as I understand it TC is not part of the solution space here.
>
>My vague understanding is putting a NIC into switchdev mode means packets
>flow into HW only not passing through the kernel, and this is what screws ARP
Nope. Think of it as another switch inside the NIC that connects the VFs
and the uplink port. You have representors that represent the switch
ports. Each representor has a counterpart VF. You have to configure the
forwarding between the representors, similar to switch ports. In a
switch, there is also no default forwarding.
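
A rough sketch of the ovs variant, to show what "forwarding between
representors" means in practice. The bridge name br-sriov and the
run/ovs_forward helpers are made up for illustration; ieth3, ieth3r0 and
VLAN 133 are taken from your listing. It assumes Open vSwitch is
installed and that the VLAN tag is applied on the switch port (tag=)
rather than by the legacy per-VF vlan setting, which I have not verified
on your setup. By default it only prints the commands:

```shell
#!/bin/sh
# Sketch only: attach the uplink and the VF representors to one OVS bridge.
BR=${BR:-br-sriov}        # hypothetical bridge name
UPLINK=${UPLINK:-ieth3}   # uplink (PF) netdev, from the thread
VLAN=${VLAN:-133}         # VLAN id, from the thread
DRY_RUN=${DRY_RUN:-1}     # 1 = print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# usage: ovs_forward <representor>...
ovs_forward() {
    # Ask OVS to offload flows to the NIC where it can.
    run ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
    run ovs-vsctl --may-exist add-br "$BR"
    # The uplink stays a trunk port: tagged frames pass unchanged.
    run ovs-vsctl --may-exist add-port "$BR" "$UPLINK"
    for rep in "$@"; do
        # Each representor becomes an access port in the given VLAN.
        run ovs-vsctl --may-exist add-port "$BR" "$rep" tag="$VLAN"
        run ip link set dev "$rep" up
    done
    run ip link set dev "$UPLINK" up
}
```

Calling "ovs_forward ieth3r0 ieth3r1 ieth3r2 ieth3r3" prints the commands
for four representors; set DRY_RUN=0 to run them as root. The hw-offload
setting normally takes effect only after ovs-vswitchd is restarted.
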
>up since the kernel is needed a bit. A bridge is supposed to fix that. I tried,
>
>brctl addbr sriovbr
>brctl addif sriovbr <DEV>
>ip link set dev sriovbr up
>ip addr ... sriov ...
I don't think that bridge offload is supported, but I may be wrong.
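
If the Linux bridge does not work out, the tc route could look roughly
like the sketch below: one flower rule per direction between a
representor and the uplink. The run/tc_forward helpers are made up for
illustration and the device names and VLAN id are the ones from your
listing. It assumes each VF has its own VLAN; if several VFs share a
VLAN you would also need to match on dst_mac and handle broadcast, which
is where ovs is easier. By default it only prints the commands:

```shell
#!/bin/sh
# Sketch only: forward between one representor and the uplink with tc flower.
DRY_RUN=${DRY_RUN:-1}     # 1 = print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# usage: tc_forward <uplink> <representor> <vlan-id>
tc_forward() {
    uplink=$1
    rep=$2
    vlan=$3
    # Both devices need an ingress qdisc to attach filters to.
    run tc qdisc add dev "$rep" ingress
    run tc qdisc add dev "$uplink" ingress
    # VF -> wire: push the VLAN tag, then send out the uplink.
    run tc filter add dev "$rep" ingress protocol all flower \
        action vlan push id "$vlan" \
        action mirred egress redirect dev "$uplink"
    # wire -> VF: match the VLAN, pop the tag, hand to the representor.
    run tc filter add dev "$uplink" ingress protocol 802.1Q flower \
        vlan_id "$vlan" \
        action vlan pop \
        action mirred egress redirect dev "$rep"
}
```

For example, "tc_forward ieth3 ieth3r0 133" prints the four tc commands
for VF 0; set DRY_RUN=0 to run them as root.
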
>
>where <DEV> was the link name of the physical device, or the virtual link, or
>the port representor, or combo to no effect.
>
>So, restating the issue: A NIC is SR-IOV virtualized into 4 virt NICs each with
>a vlan, IP address. The NIC is placed into switchdev mode. The virtual NICs
>are not pingable from other boxes. The other boxes see the NIC's MAC
>addresses as incomplete (arp -n or arp -e).
>
>What and how do I bridge/link to fix this problem?
>
>On Sat, Apr 27, 2024 at 6:26 AM Jiri Pirko <jiri@...nulli.us> wrote:
>>
>> Fri, Apr 26, 2024 at 10:35:28PM CEST, gshanemiller6@...il.com wrote:
>> >Problem:
>> >-----------------------------------------------------------------
>> >root@...hA $ ping 10.xx.xx.194
>> >PING 10.xx.xx.194 (10.xx.xx.194) 56(84) bytes of data
>> >From 10.xx.xx.191 icmp_seq=10 Destination Host Unreachable
>> >Proximate Cause:
>> >-----------------------------------------------------------------
>> >This seems to be a side effect of "switchdev" mode. When the identical
>> >configuration is set up EXCEPT that the SR-IOV virtualized NIC is left
>> >"legacy", ping (and ncat) works just fine.
>> >
>> >As far as I can tell I need a bridge or bridge commands, but I have no
>> >idea where to start. This environment will not allow me to add modify
>> >commands when enabling switchdev mode. devlink seems to accept
>> >"switchdev" alone without modifiers.
>>
>> You have to configure forwarding between appropriate representors. Use
>> ovs (probably easiest) or tc.
>>
>> >
>> >Note: putting a NIC into switchdev mode makes the virtual functions
>> >show as "link-state disable" which is confusing. (See below.) Contrary
>> >to what it seems to suggest, the virtual NICs are up and running
>> >
>> >Running "arp -e" on machine A shows machine B's ieth3v0 MAC address as
>> >incomplete suggesting switchdev+ARP is broken.
>> >
>> >Problem Environment:
>> >-----------------------------------------------------------------
>> >OS: RHEL 8.6 4.18.0-372.46.1.el8 x64
>> >NICs: Mellanox ConnectX-6
>> >
>> >Machine A Links:
>> >70 tst@...h3: <...LOWER_UP...> mtu 1500
>> > link/ether xx.xx.xx.xx.xx.xx
>> > vlan protocol 802.1Q id 133 <REORDER_HDR>
>> > Inet 10.xx.xx.191
>> >
>> >Machine B Links With ieth3 in SR-IOV mode in switchdev mode:
>> ># Physical Function and its virtual functions:
>> > 2: ieth3: <...PROMISC,UP,LOWER_UP> mtu 1500
>> > link/ether xx.xx.xx.xx.xx.f6 portname p0 switchid xxxxe988
>> > vf 0 link/ether xx.xx.xx.xx.xx.00 vlan 133 spoof off, link-state disable, trust off
>> > . . .
>> ># Port representors
>> >893: ieth3r0: <...UP,LOWER_UP> mtu 1500
>> >link/ether xx.xx.xx.xx.xx.e1 portname pf0vf0 switchid xxxxe988
>> >. . .
>> ># Virtual Links
>> >897: ieth3v0: <...UP,LOWER_UP> mtu 1500
>> > link/ether xx.xx.xx.xx.xx.00 promiscuity 0
>> > inet 10.xx.xx.194/24 scope global ieth3v0
>> > . . .
>> >
>> >SR-IOV Setup Summary
>> >-----------------------------------------------------------------
>> >This is done right since, in legacy mode, ping/ncat works fine:
>> >
>> >1. Enable IOMMU, VT-x in BIOS
>> >2. Boot Linux with iommu=on on command line
>> >3. Install Mellanox OFED
>> >4. Enable SR-IOV for max 8 devices in Mellanox firmware
>> >(reboot)
>> >5. Create 4 virtual NICs w/ SR-IOV
>> >6. Configure 4 virtual NICs mac, trust off, spoofchk off, state auto
>> >7. Unbind virtual NICs
>> >8. Put ieth3 into switchdev mode
>> >9. Rebind virtual NICs
>> >10. Bring all links up
>> >11. Assign IPV4 addresses to virtual links
>> >