Message-Id: <1720790333-456232-1-git-send-email-steven.sistare@oracle.com>
Date: Fri, 12 Jul 2024 06:18:46 -0700
From: Steve Sistare <steven.sistare@...cle.com>
To: virtualization@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Cc: "Michael S. Tsirkin" <mst@...hat.com>, Jason Wang <jasowang@...hat.com>,
        Si-Wei Liu <si-wei.liu@...cle.com>,
        Eugenio Perez Martin <eperezma@...hat.com>,
        Xuan Zhuo <xuanzhuo@...ux.alibaba.com>,
        Dragos Tatulea <dtatulea@...dia.com>,
        Steve Sistare <steven.sistare@...cle.com>
Subject: [PATCH V2 0/7] vdpa live update

Live update is a technique wherein an application saves its state, exec's
to an updated version of itself, and restores its state.  Clients of the
application experience a brief suspension of service, on the order of
hundreds of milliseconds, but are otherwise unaffected.

Define and implement interfaces that allow vdpa devices to be preserved
across fork or exec, to support live update for applications such as QEMU.
The device must be suspended during the update, but its DMA mappings are
preserved, so the suspension is brief.

The VHOST_NEW_OWNER ioctl transfers device ownership and pinned memory
accounting from one process to another.

The VHOST_BACKEND_F_NEW_OWNER backend capability indicates that
VHOST_NEW_OWNER is supported.

The VHOST_IOTLB_REMAP message type updates a DMA mapping with its userland
address in the new process.
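
As a rough userland sketch, a remap message could be built and sent like
an existing VHOST_IOTLB_UPDATE message.  Assumptions, none of which are
confirmed by this cover letter: the V2 message format is in use, the
mapping is identified by its iova and size, perm is left zero, and the
numeric value given for VHOST_IOTLB_REMAP is a placeholder used only when
the installed headers do not define it.  The helper names are hypothetical.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <linux/vhost.h>

#ifndef VHOST_IOTLB_REMAP
#define VHOST_IOTLB_REMAP 7   /* PLACEHOLDER: real value comes from the series */
#endif

/* Fill a V2 iotlb message that carries the new userland address. */
void fill_remap_msg(struct vhost_msg_v2 *msg, uint64_t iova,
                    uint64_t size, uint64_t new_uaddr)
{
    memset(msg, 0, sizeof(*msg));
    msg->type = VHOST_IOTLB_MSG_V2;
    msg->iotlb.iova = iova;          /* assumed to identify the mapping */
    msg->iotlb.size = size;          /* assumed to identify the mapping */
    msg->iotlb.uaddr = new_uaddr;    /* address in the new process */
    msg->iotlb.type = VHOST_IOTLB_REMAP;
}

/* Write one remap message to the device fd.  Returns 0 or -1. */
int send_remap(int fd, uint64_t iova, uint64_t size, uint64_t new_uaddr)
{
    struct vhost_msg_v2 msg;

    fill_remap_msg(&msg, iova, size, new_uaddr);
    if (write(fd, &msg, sizeof(msg)) != (ssize_t)sizeof(msg))
        return -1;
    return 0;
}
```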

The VHOST_BACKEND_F_IOTLB_REMAP backend capability indicates that
VHOST_IOTLB_REMAP is supported and required.  Some devices do not
require it, because the userland address of each DMA mapping is discarded
after being translated to a physical address.
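
A sketch of how userland might test the two capabilities.  Backend
features are read with the existing VHOST_GET_BACKEND_FEATURES ioctl and
are bit numbers, like the other VHOST_BACKEND_F_* definitions.  The bit
values below are placeholders used only when the installed headers lack
the new definitions, and the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

#ifndef VHOST_BACKEND_F_NEW_OWNER
#define VHOST_BACKEND_F_NEW_OWNER   0x9   /* PLACEHOLDER bit number */
#endif
#ifndef VHOST_BACKEND_F_IOTLB_REMAP
#define VHOST_BACKEND_F_IOTLB_REMAP 0xa   /* PLACEHOLDER bit number */
#endif

/* Read the backend feature bits from the device.  Returns 0 or -1. */
int get_backend_features(int fd, uint64_t *features)
{
    return ioctl(fd, VHOST_GET_BACKEND_FEATURES, features);
}

static bool has_feature(uint64_t features, unsigned int bit)
{
    return (features >> bit) & 1;
}

/* True if ownership can be transferred with VHOST_NEW_OWNER. */
bool can_live_update(uint64_t features)
{
    return has_feature(features, VHOST_BACKEND_F_NEW_OWNER);
}

/* True if each mapping must be re-registered with VHOST_IOTLB_REMAP. */
bool must_remap(uint64_t features)
{
    return has_feature(features, VHOST_BACKEND_F_IOTLB_REMAP);
}
```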

Here is a pseudo-code sequence for performing live update, based on
suspend + reset because resume is not yet widely available.  The vdpa device
descriptor, fd, remains open across the exec.

  ioctl(fd, VHOST_VDPA_SUSPEND)
  ioctl(fd, VHOST_VDPA_SET_STATUS, 0)
  exec

  ioctl(fd, VHOST_NEW_OWNER)

  issue ioctls to re-create vrings

  if VHOST_BACKEND_F_IOTLB_REMAP
      foreach dma mapping
          write(fd, {VHOST_IOTLB_REMAP, new_addr})

  ioctl(fd, VHOST_VDPA_SET_STATUS,
            ACKNOWLEDGE | DRIVER | FEATURES_OK | DRIVER_OK)

This is faster than VHOST_RESET_OWNER + VHOST_SET_OWNER + VHOST_IOTLB_UPDATE,
as that would unpin and repin physical pages, which would cost multiple
seconds for large memories.

This is implemented in QEMU by the patch series "Live update: vdpa"
  https://lore.kernel.org/qemu-devel/TBD  (reference to be posted shortly)

The QEMU implementation leverages the live migration code path, but after
CPR exec's new QEMU:
  - vhost_vdpa_set_owner() calls VHOST_NEW_OWNER instead of VHOST_SET_OWNER
  - vhost_vdpa_dma_map() sets type VHOST_IOTLB_REMAP instead of
    VHOST_IOTLB_UPDATE

Changes in V2:
  - clean up handling of set_map vs dma_map vs platform iommu in remap
  - augment and clarify commit messages and comments

Steve Sistare (7):
  vhost-vdpa: count pinned memory
  vhost-vdpa: pass mm to bind
  vhost-vdpa: VHOST_NEW_OWNER
  vhost-vdpa: VHOST_BACKEND_F_NEW_OWNER
  vhost-vdpa: VHOST_IOTLB_REMAP
  vhost-vdpa: VHOST_BACKEND_F_IOTLB_REMAP
  vdpa/mlx5: new owner capability

 drivers/vdpa/mlx5/net/mlx5_vnet.c |   3 +-
 drivers/vhost/vdpa.c              | 125 ++++++++++++++++++++++++++++--
 drivers/vhost/vhost.c             |  15 ++++
 drivers/vhost/vhost.h             |   1 +
 include/uapi/linux/vhost.h        |  10 +++
 include/uapi/linux/vhost_types.h  |  15 +++-
 6 files changed, 161 insertions(+), 8 deletions(-)

-- 
2.39.3

