[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-Id: <20211116072148.2044042-1-vasudev@copyninja.info>
Date:   Tue, 16 Nov 2021 12:51:48 +0530
From:   Vasudev Kamath <vasudev@...yninja.info>
To:     netdev@...r.kernel.org
Cc:     "David S . Miller" <davem@...emloft.net>,
        Sridhar Samudrala <sridhar.samudrala@...el.com>,
        Krishna Kumar <krikku@...il.com>,
        Vasudev Kamath <vasudev@...yninja.info>
Subject: [PATCH net-next] Documentation: networking: net_failover: Fix documentation
Update net_failover documentation with missing and incomplete
details to get a proper working setup.
Signed-off-by: Vasudev Kamath <vasudev@...yninja.info>
Reviewed-by: Krishna Kumar <krikku@...il.com>
---
 Documentation/networking/net_failover.rst | 111 +++++++++++++++++-----
 1 file changed, 88 insertions(+), 23 deletions(-)
diff --git a/Documentation/networking/net_failover.rst b/Documentation/networking/net_failover.rst
index e143ab79a960..3a662f2b4d6e 100644
--- a/Documentation/networking/net_failover.rst
+++ b/Documentation/networking/net_failover.rst
@@ -35,7 +35,7 @@ To support this, the hypervisor needs to enable VIRTIO_NET_F_STANDBY
 feature on the virtio-net interface and assign the same MAC address to both
 virtio-net and VF interfaces.
 
-Here is an example XML snippet that shows such configuration.
+Here is an example libvirt XML snippet that shows such configuration:
 ::
 
   <interface type='network'>
@@ -45,18 +45,32 @@ Here is an example XML snippet that shows such configuration.
     <model type='virtio'/>
     <driver name='vhost' queues='4'/>
     <link state='down'/>
-    <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
+    <teaming type='persistent'/>
+    <alias name='ua-backup0'/>
   </interface>
   <interface type='hostdev' managed='yes'>
     <mac address='52:54:00:00:12:53'/>
     <source>
       <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
     </source>
-    <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
+    <teaming type='transient' persistent='ua-backup0'/>
   </interface>
 
+In this configuration, the first device definition is for the virtio-net
+interface and this acts as the 'persistent' device indicating that this
+interface will always be plugged in. This is specified by the 'teaming' tag with
+required attribute type having value 'persistent'. The link state for the
+virtio-net device is set to 'down' to ensure that the 'failover' netdev prefers
+the VF passthrough device for normal communication. The virtio-net device will
+be brought UP during live migration to allow uninterrupted communication.
+
+The second device definition is for the VF passthrough interface. Here the
+'teaming' tag is provided with type 'transient' indicating that this device may
+periodically be unplugged. A second attribute - 'persistent' is provided and
+points to the alias name declared for the virtio-net device.
+
 Booting a VM with the above configuration will result in the following 3
-netdevs created in the VM.
+interfaces created in the VM:
 ::
 
   4: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
@@ -65,13 +79,36 @@ netdevs created in the VM.
          valid_lft 42482sec preferred_lft 42482sec
       inet6 fe80::97d8:db2:8c10:b6d6/64 scope link
          valid_lft forever preferred_lft forever
-  5: ens10nsby: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ens10 state UP group default qlen 1000
+  5: ens10nsby: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master ens10 state DOWN group default qlen 1000
       link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
   7: ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ens10 state UP group default qlen 1000
       link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
 
-ens10 is the 'failover' master netdev, ens10nsby and ens11 are the slave
-'standby' and 'primary' netdevs respectively.
+Here, ens10 is the 'failover' master interface, ens10nsby is the slave 'standby'
+virtio-net interface, and ens11 is the slave 'primary' VF passthrough interface.
+
+One point to note here is that some user space network configuration daemons
+like systemd-networkd, ifupdown, etc, do not understand the 'net_failover'
+device; and on the first boot, the VM might end up with both 'failover' device
+and VF accquiring IP addresses (either same or different) from the DHCP server.
+This will result in lack of connectivity to the VM. So some tweaks might be
+needed to these network configuration daemons to make sure that an IP is
+received only on the 'failover' device.
+
+Below is the patch snippet used with 'cloud-ifupdown-helper' script found on
+Debian cloud images:
+
+::
+  @@ -27,6 +27,8 @@ do_setup() {
+       local working="$cfgdir/.$INTERFACE"
+       local final="$cfgdir/$INTERFACE"
+
+  +    if [ -d "/sys/class/net/${INTERFACE}/master" ]; then exit 0; fi
+  +
+       if ifup --no-act "$INTERFACE" > /dev/null 2>&1; then
+           # interface is already known to ifupdown, no need to generate cfg
+           log "Skipping configuration generation for $INTERFACE"
+
 
 Live Migration of a VM with SR-IOV VF & virtio-net in STANDBY mode
 ==================================================================
@@ -80,40 +117,68 @@ net_failover also enables hypervisor controlled live migration to be supported
 with VMs that have direct attached SR-IOV VF devices by automatic failover to
 the paravirtual datapath when the VF is unplugged.
 
-Here is a sample script that shows the steps to initiate live migration on
-the source hypervisor.
+Here is a sample script that shows the steps to initiate live migration from
+the source hypervisor. Note: It is assumed that the VM is connected to a
+software bridge 'br0' which has a single VF attached to it along with the vnet
+device to the VM. This is not the VF that was passthrough'd to the VM (seen in
+the vf.xml file).
 ::
 
-  # cat vf_xml
+  # cat vf.xml
   <interface type='hostdev' managed='yes'>
     <mac address='52:54:00:00:12:53'/>
     <source>
       <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
     </source>
-    <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
+    <teaming type='transient' persistent='ua-backup0'/>
   </interface>
 
-  # Source Hypervisor
+  # Source Hypervisor migrate.sh
   #!/bin/bash
 
-  DOMAIN=fedora27-tap01
-  PF=enp66s0f0
-  VF_NUM=5
-  TAP_IF=tap01
-  VF_XML=
+  DOMAIN=vm-01
+  PF=ens6np0
+  VF=ens6v1             # VF attached to the bridge.
+  VF_NUM=1
+  TAP_IF=vmtap01        # virtio-net interface in the VM.
+  VF_XML=vf.xml
 
   MAC=52:54:00:00:12:53
   ZERO_MAC=00:00:00:00:00:00
 
+  # Set the virtio-net interface up.
   virsh domif-setlink $DOMAIN $TAP_IF up
-  bridge fdb del $MAC dev $PF master
-  virsh detach-device $DOMAIN $VF_XML
+
+  # Remove the VF that was passthrough'd to the VM.
+  virsh detach-device --live --config $DOMAIN $VF_XML
+
   ip link set $PF vf $VF_NUM mac $ZERO_MAC
 
-  virsh migrate --live $DOMAIN qemu+ssh://$REMOTE_HOST/system
+  # Add FDB entry for traffic to continue going to the VM via
+  # the VF -> br0 -> vnet interface path.
+  bridge fdb add $MAC dev $VF
+  bridge fdb add $MAC dev $TAP_IF master
+
+  # Migrate the VM
+  virsh migrate --live --persistent $DOMAIN qemu+ssh://$REMOTE_HOST/system
+
+  # Clean up FDB entries after migration completes.
+  bridge fdb del $MAC dev $VF
+  bridge fdb del $MAC dev $TAP_IF master
 
-  # Destination Hypervisor
+On the destination hypervisor, a shared bridge 'br0' is created before migration
+starts, and a VF from the destination PF is added to the bridge. Similarly an
+appropriate FDB entry is added.
+
+The following script is executed on the destination hypervisor once migration
+completes, and it reattaches the VF to the VM and brings down the virtio-net
+interface.
+
+::
+  # reattach-vf.sh
   #!/bin/bash
 
-  virsh attach-device $DOMAIN $VF_XML
-  virsh domif-setlink $DOMAIN $TAP_IF down
+  bridge fdb del 52:54:00:00:12:53 dev ens36v0
+  bridge fdb del 52:54:00:00:12:53 dev vmtap01 master
+  virsh attach-device --config --live vm01 vf.xml
+  virsh domif-setlink vm01 vmtap01 down
-- 
2.33.1
Powered by blists - more mailing lists
 
