Message-ID: <ygnho8gtgw2l.fsf@nvidia.com>
Date: Tue, 9 Feb 2021 16:22:26 +0200
From: Vlad Buslov <vladbu@...dia.com>
To: Jakub Kicinski <kuba@...nel.org>
CC: Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
Saeed Mahameed <saeed@...nel.org>,
"David S. Miller" <davem@...emloft.net>, <netdev@...r.kernel.org>,
Mark Bloch <mbloch@...dia.com>,
Saeed Mahameed <saeedm@...dia.com>
Subject: Re: [net-next V2 01/17] net/mlx5: E-Switch, Refactor setting source port

On Mon 08 Feb 2021 at 22:22, Jakub Kicinski <kuba@...nel.org> wrote:
> On Mon, 8 Feb 2021 10:21:21 +0200 Vlad Buslov wrote:
>> > These operations imply that 7.7.7.5 is configured on some interface on
>> > the host. Most likely the VF representor itself, as that aids with ARP
>> > resolution. Is that so?
>>
>> Hi Marcelo,
>>
>> The tunnel endpoint IP address is configured on VF that is represented
>> by enp8s0f0_0 representor in example rules. The VF is on host.
>
> This is very confusing, are you saying that the 7.7.7.5 is configured
> both on VF and VFrep? Could you provide a full picture of the config
> with IP addresses and routing?

Hi Jakub,

No, the tunnel IP is configured on the VF, not on its representor. That
particular VF is in the host namespace. When mlx5 resolves the tunnel,
the code checks whether the tunnel endpoint IP address is on such an
mlx5 VF, since the VF is in the same namespace as the eswitch manager
(e.g. on the host) and the route returned by ip_route_output_key()
resolves through the tunnel VF device (rt->dst.dev == tunVF).

Once it is established that the tunnel is on a VF, the goal is to
process the two resulting TC rules (one in each direction) fully in
hardware, without exposing the packet in software on either the
tunneling device or the tunnel VF. That is what the infrastructure in
this series implements.
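
The check described above can be illustrated with a toy model. This is
only a sketch in Python, not the driver code: NetDev,
route_output_dev() and tunnel_on_vf() are hypothetical names, the
route lookup is reduced to a connected-subnet match standing in for
ip_route_output_key(), and the remote endpoint 7.7.7.1 is an assumed
address on the underlay.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_interface
from typing import Optional


@dataclass(frozen=True)
class NetDev:
    name: str
    netns: str            # namespace the device lives in
    is_mlx5_vf: bool      # True for a VF of the same mlx5 eswitch
    addr: Optional[str]   # IPv4 address with prefix length, or None


def route_output_dev(devs, netns, dst_ip):
    """Toy route lookup: pick the device in `netns` whose connected
    subnet contains `dst_ip` (stand-in for rt->dst.dev)."""
    dst = ip_address(dst_ip)
    for dev in devs:
        if dev.netns == netns and dev.addr is not None:
            if dst in ip_interface(dev.addr).network:
                return dev
    return None


def tunnel_on_vf(devs, eswitch_netns, remote_ip):
    """True when the route to the remote tunnel endpoint, looked up in
    the eswitch manager's namespace, goes out through an mlx5 VF."""
    dev = route_output_dev(devs, eswitch_netns, remote_ip)
    return dev is not None and dev.is_mlx5_vf


devs = [
    NetDev("enp8s0f0v0", "host", True, "7.7.7.5/24"),   # tunnel VF
    NetDev("enp8s0f0_0", "host", False, None),          # its representor
    NetDev("enp8s0f0v1", "ns0", True, "5.5.5.5/24"),    # user VF in a netns
]

print(tunnel_on_vf(devs, "host", "7.7.7.1"))  # True: route goes via tunnel VF
print(tunnel_on_vf(devs, "host", "5.5.5.1"))  # False: user VF is not on host
```

In this model the representor never matches, because it carries no
IPv4 address, and the user VF never matches, because it lives in a
different namespace than the eswitch manager.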

So, to summarize with the IP addresses from the TC examples in the cover
letter: we have the underlay network 7.7.7.0/24 in the host namespace,
with the tunnel endpoint IP address on the VF:

$ ip a show dev enp8s0f0v0
1537: enp8s0f0v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 52:e5:6d:f2:00:69 brd ff:ff:ff:ff:ff:ff
    altname enp8s0f0np0v0
    inet 7.7.7.5/24 scope global enp8s0f0v0
       valid_lft forever preferred_lft forever
    inet6 fe80::50e5:6dff:fef2:69/64 scope link
       valid_lft forever preferred_lft forever

Like all VFs in the switchdev model, the tunnel VF is controlled through
a representor that doesn't have any IP address assigned:

$ ip a show dev enp8s0f0_0
1534: enp8s0f0_0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
    link/ether 96:98:b1:59:aa:5e brd ff:ff:ff:ff:ff:ff
    altname enp8s0f0npf0vf0
    inet6 fe80::9498:b1ff:fe59:aa5e/64 scope link
       valid_lft forever preferred_lft forever
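
The difference between the two outputs above (an IPv4 address on the VF,
none on the representor) can also be checked mechanically. A minimal
sketch in Python, assuming the text format printed by "ip a"; the helper
name ipv4_addrs() is hypothetical and the sample strings are shortened
copies of the outputs above:

```python
import re


def ipv4_addrs(ip_a_output):
    """Return the IPv4 addresses (with prefix length) found on 'inet'
    lines; 'inet6' lines are deliberately not matched."""
    return re.findall(r"^[ \t]*inet (\S+)", ip_a_output, flags=re.MULTILINE)


VF_OUTPUT = """\
1537: enp8s0f0v0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP
    link/ether 52:e5:6d:f2:00:69 brd ff:ff:ff:ff:ff:ff
    inet 7.7.7.5/24 scope global enp8s0f0v0
    inet6 fe80::50e5:6dff:fef2:69/64 scope link
"""

REP_OUTPUT = """\
1534: enp8s0f0_0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system
    link/ether 96:98:b1:59:aa:5e brd ff:ff:ff:ff:ff:ff
    inet6 fe80::9498:b1ff:fe59:aa5e/64 scope link
"""

print(ipv4_addrs(VF_OUTPUT))   # ['7.7.7.5/24']
print(ipv4_addrs(REP_OUTPUT))  # []
```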

User VFs have IP addresses from the overlay network (5.5.5.0/24 in my
tests) and are in namespaces/VMs, while only their representors are on
the host, attached to the same v-switch bridge as the tunnel VF
representor:

$ sudo ip netns exec ns0 ip a show dev enp8s0f0v1
1538: enp8s0f0v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 9e:cf:b5:69:84:d1 brd ff:ff:ff:ff:ff:ff
    altname enp8s0f0np0v1
    inet 5.5.5.5/24 scope global enp8s0f0v1
       valid_lft forever preferred_lft forever
    inet6 fe80::9ccf:b5ff:fe69:84d1/64 scope link
       valid_lft forever preferred_lft forever
$ ip a show dev enp8s0f0_1
1535: enp8s0f0_1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ovs-system state UP group default qlen 1000
    link/ether 06:96:1e:23:df:a4 brd ff:ff:ff:ff:ff:ff
    altname enp8s0f0npf0vf1
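
To keep the two address ranges apart when reading the rules, the split
between underlay and overlay can be written down as a small Python
sketch. It uses only the prefixes and addresses quoted in this message;
the function name classify() is hypothetical.

```python
from ipaddress import ip_address, ip_network

UNDERLAY = ip_network("7.7.7.0/24")  # tunnel endpoints, host namespace
OVERLAY = ip_network("5.5.5.0/24")   # user VFs in namespaces/VMs


def classify(addr):
    """Tell whether an address belongs to the underlay or the overlay."""
    ip = ip_address(addr)
    if ip in UNDERLAY:
        return "underlay"
    if ip in OVERLAY:
        return "overlay"
    return "other"


for addr in ("7.7.7.5", "5.5.5.5", "5.5.5.1"):
    print(addr, classify(addr))
```

Here 7.7.7.5 is the tunnel VF address, 5.5.5.5 is the user VF in ns0,
and 5.5.5.1 is the iperf3 server on the remote machine.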
OVS bridge ports:
$ sudo ovs-vsctl list-ports ovs-br
enp8s0f0
enp8s0f0_0
enp8s0f0_1
enp8s0f0_2
vxlan0

The TC rules from the cover letter are installed by OVS, configured as
described above, when running iperf traffic from the namespaced VF
enp8s0f0v1 to another machine connected over the uplink port:

$ sudo ip netns exec ns0 iperf3 -c 5.5.5.1 -t 10000
Connecting to host 5.5.5.1, port 5201
[ 5] local 5.5.5.5 port 34486 connected to 5.5.5.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 158 MBytes 1.32 Gbits/sec 41 771 KBytes

Hope this clarifies things, and sorry for the confusion!

Regards,
Vlad