lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BN9PR11MB54330217565C687A7275AE458C859@BN9PR11MB5433.namprd11.prod.outlook.com>
Date:   Wed, 27 Oct 2021 02:32:57 +0000
From:   "Tian, Kevin" <kevin.tian@...el.com>
To:     David Gibson <david@...son.dropbear.id.au>
CC:     "Liu, Yi L" <yi.l.liu@...el.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "jasowang@...hat.com" <jasowang@...hat.com>,
        "kwankhede@...dia.com" <kwankhede@...dia.com>,
        "hch@....de" <hch@....de>,
        "jean-philippe@...aro.org" <jean-philippe@...aro.org>,
        "Jiang, Dave" <dave.jiang@...el.com>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "corbet@....net" <corbet@....net>,
        "jgg@...dia.com" <jgg@...dia.com>,
        "parav@...lanox.com" <parav@...lanox.com>,
        "alex.williamson@...hat.com" <alex.williamson@...hat.com>,
        "lkml@...ux.net" <lkml@...ux.net>,
        "dwmw2@...radead.org" <dwmw2@...radead.org>,
        "Tian, Jun J" <jun.j.tian@...el.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "lushenming@...wei.com" <lushenming@...wei.com>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        "pbonzini@...hat.com" <pbonzini@...hat.com>,
        "robin.murphy@....com" <robin.murphy@....com>
Subject: RE: [RFC 11/20] iommu/iommufd: Add IOMMU_IOASID_ALLOC/FREE

> From: David Gibson <david@...son.dropbear.id.au>
> Sent: Monday, October 25, 2021 1:05 PM
> 
> > > > For above cases a [base, max] hint can be provided by the user per
> > > > Jason's recommendation.
> > >
> > > Provided at which stage?
> >
> > IOMMU_IOASID_ALLOC
> 
> Ok.  I have mixed thoughts on this.  Doing this at ALLOC time was my
> first instict as well.  However with Jason's suggestion that any of a
> number of things could disambiguate multiple IOAS attached to a
> device, I wonder if it makes more sense for consistency to put base
> address at attach time, as with PASID.

In that case the base address provided at attach time is used as an 
address space ID similar to PASID, which imho is orthogonal to the 
generic [base, size] info for IOAS itself. The 2nd base sort of becomes 
an offset on top of the first base in ppc case.

> >
> > regarding live migration with vfio devices, it's still in early stage. there
> > are tons of compatibility check opens to be addressed before it can
> > be widely deployed. this might just add another annoying open to that
> > long list...
> 
> So, yes, live migration with VFIO is limited, unfortunately this
> still affects us even if we don't (currently) have VFIO devices.  The
> problem arises from the combination of two limitations:
> 
> 1) Live migration means that we can't dynamically select guest visible
> IOVA parameters at qemu start up time.  We need to get consistent
> guest visible behaviour for a given set of qemu options, so that we
> can migrate between them.
> 
> 2) Device hotplug means that we don't know if a PCI domain will have
> VFIO devices on it when we start qemu.  So, we don't know if host
> limitations on IOVA ranges will affect the guest or not.
> 
> Together these mean that the best we can do is to define a *fixed*
> (per machine type) configuration based on qemu options only.  That is,
> defined by the guest platform we're trying to present, only, never
> host capabilities.  We can then see if that configuration is possible
> on the host and pass or fail.  It's never safe to go the other
> direction and take host capabilities and present those to the guest.
> 

That is just one userspace policy. We don't want to design a uAPI
just for a specific userspace implementation. In concept the 
userspace could:

1)  use DMA-API like map/unmap i.e. letting IOVA address space
    managed by the kernel;

    * suitable for simple applications e.g. dpdk.

2)  manage IOVA address space with *fixed* layout:

    * fail device passthrough at MAP_DMA if conflict is detected
      between mapped range and device specific IOVA holes

    * suitable for VM when live migration is highly concerned

    * potential problem with vIOMMU since the guest is unaware
      of host constraints thus undefined behavior may occur if
      guest IOVA addresses happens to overlap with host IOVA holes.

        * ppc is special as you need to claim guest IOVA ranges in
          the host. But it's not the case for other emulated IOMMUs.

3)  manage IOVA address space with host constraints:

    * create IOVA layout by combining qemu options and IOVA holes 
      of all boot-time passthrough devices

    * reject hotplugged device if it has conflicting IOVA holes with
      the initial IOVA layout

    * suitable for vIOMMU since host constraints can be further 
      reported to the guest

    * suitable for VM w/o live migration requirement, e.g. in many
      client virtualization scenarios

    * suboptimal with VM live migration with compatibility limitation

Overall the proposed uAPI will provide:

1)  a simple DMA-API-like mapping protocol for kernel managed IOVA
    address space:

2)  a vfio-like mapping protocol for user managed IOVA address space:

    a) check IOVA conflict in MAP_DMA ioctl;
    b) allows the user to query available IOVA ranges;

Then it's totally user policy on how it wants to utilize those ioctls.

Thanks
Kevin

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ