lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <03d496b0-84c2-b3ca-5be5-d4540c6d8ec7@arm.com>
Date:   Thu, 13 Sep 2018 16:03:01 +0100
From:   Jean-Philippe Brucker <jean-philippe.brucker@....com>
To:     "Tian, Kevin" <kevin.tian@...el.com>,
        Lu Baolu <baolu.lu@...ux.intel.com>,
        Joerg Roedel <joro@...tes.org>,
        David Woodhouse <dwmw2@...radead.org>,
        Alex Williamson <alex.williamson@...hat.com>,
        Kirti Wankhede <kwankhede@...dia.com>
Cc:     "Raj, Ashok" <ashok.raj@...el.com>,
        "Bie, Tiwei" <tiwei.bie@...el.com>,
        "Kumar, Sanjay K" <sanjay.k.kumar@...el.com>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Sun, Yi Y" <yi.y.sun@...el.com>,
        "Pan, Jacob jun" <jacob.jun.pan@...el.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>
Subject: Re: [RFC PATCH v2 00/10] vfio/mdev: IOMMU aware mediated device

On 13/09/2018 01:19, Tian, Kevin wrote:
>>> This is proposed for architectures which support finer granularity
>>> second level translation with no impact on architectures which only
>>> support Source ID or the similar granularity.
>>
>> Just to be clear, in this paragraph you're only referring to the
>> Nested/second-level translation for mdev, which is specific to vt-d
>> rev3? Other architectures can still do first-level translation with
>> PASID, to support some use-cases of IOMMU aware mediated device
>> (assigning mdevs to userspace drivers, for example)
> 
> yes. aux domain concept applies only to vt-d rev3 which introduces
> scalable mode. Care is taken to avoid breaking usages on existing
> architectures.
> 
> one note. Assigning mdevs to user space alone doesn't imply IOMMU
> aware. All existing mdev usages use software or proprietary methods to
> isolate DMA. There is only one potential IOMMU aware mdev usage 
> which we talked not rely on vt-d rev3 scalable mode - wrap a random 
> PCI device into a single mdev instance (no sharing). In that case mdev 
> inherits RID from parent PCI device, thus is isolated by IOMMU in RID 
> granular. Our RFC supports this usage too. In VFIO two usages (PASID-
> based and RID-based) use same code path, i.e. always binding domain to
> the parent device of mdev. But within IOMMU they go different paths.
> PASID-based will go to aux-domain as iommu_enable_aux_domain
> has been called on that device. RID-based will follow existing 
> unmanaged domain path, as if it is parent device assignment.

For Arm SMMU we're more interested in the PASID-granular case than the
RID-granular one. It doesn't necessarily require vt-d rev3 scalable
mode, the following example can be implemented with an SMMUv3, since it
only needs PASID-granular first-level translation:

We have a PCI function that supports PASID, and can be partitioned into
multiple isolated entities, mdevs. Each mdev has an MMIO frame, an MSI
vector and a PASID.

Different processes (userspace drivers, not QEMU) each open one mdev. A
process controlling one mdev has two ways of doing DMA:

(1) Classically, the process uses a VFIO_TYPE1v2_IOMMU container. This
creates an auxiliary domain for the mdev, with PASID #35. The process
creates DMA mappings with VFIO_IOMMU_MAP_DMA. VFIO calls iommu_map on
the auxiliary domain. The IOMMU driver populates the pgtables associated
with PASID #35.

(2) SVA. One way of doing it: the process uses a new
"VFIO_TYPE1_SVA_IOMMU" type of container. VFIO binds the process address
space to the device, gets PASID #35. Simpler, but not everyone wants to
use SVA, especially not userspace drivers which need the highest
performance.


This example only needs to modify first-level translation, and works
with SMMUv3. The kernel here could be the host, in which case
second-level translation is disabled in the SMMU, or it could be the
guest, in which case second-level mappings are created by QEMU and
first-level translation is managed by assigning PASID tables to the guest.

So (2) would use iommu_sva_bind_device(), but (1) needs something else.
Aren't auxiliary domains suitable for (1)? Why limit auxiliary domain to
second-level or nested translation? It seems silly to use a different
API for first-level, since the flow in userspace and VFIO is the same as
your second-level case as far as MAP_DMA ioctl goes. The difference is
that in your case the auxiliary domain supports an additional operation
which binds first-level page tables. An auxiliary domain that only
supports first-level wouldn't support this operation, but it can still
implement iommu_map/unmap/etc.


Another note: if for some reason you did want to allow userspace to
choose between first-level or second-level, you could implement the
VFIO_TYPE1_NESTING_IOMMU container. It acts like a VFIO_TYPE1v2_IOMMU,
but also sets the DOMAIN_ATTR_NESTING on the IOMMU domain. So DMA_MAP
ioctl on a NESTING container would populate second-level, and DMA_MAP on
a normal container populates first-level. But if you're always going to
use second-level by default, the distinction isn't necessary.


>> Sounds good, I'll drop the private PASID patch if we can figure out a
>> solution to the attach/detach_dev problem discussed on patch 8/10
>>
> 
> Can you elaborate a bit on private PASID usage? what is the
> high level flow on it? 
> 
> Again based on earlier explanation, aux domain is specific to IOMMU
> architecture supporting vtd scalable mode-like capability, which allows
> separate 2nd/1st level translations per PASID. Need a better understanding
> how private PASID is relevant here.

Private PASIDs are used for doing iommu_map/iommu_unmap on PASIDs
(first-level translation):
https://www.spinics.net/lists/dri-devel/msg177003.html As above, some
people don't want SVA, some can't do it, some may even want a few
private address spaces just for their kernel driver. They need a way to
allocate PASIDs and do iommu_map/iommu_unmap on them, without binding to
a process. I was planning to add the private PASID patch to my SVA
series, but in my opinion the feature overlaps with auxiliary domains.

Thanks,
Jean

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ