Date: Tue, 10 Aug 2021 09:17:10 +0200
From: Eric Auger <eric.auger@...hat.com>
To: "Tian, Kevin" <kevin.tian@...el.com>,
    Jason Gunthorpe <jgg@...dia.com>,
    "Alex Williamson (alex.williamson@...hat.com)" <alex.williamson@...hat.com>,
    Jean-Philippe Brucker <jean-philippe@...aro.org>,
    David Gibson <david@...son.dropbear.id.au>,
    Jason Wang <jasowang@...hat.com>,
    "parav@...lanox.com" <parav@...lanox.com>,
    "Enrico Weigelt, metux IT consult" <lkml@...ux.net>,
    Paolo Bonzini <pbonzini@...hat.com>,
    Shenming Lu <lushenming@...wei.com>,
    Joerg Roedel <joro@...tes.org>
Cc: Jonathan Corbet <corbet@....net>,
    "Raj, Ashok" <ashok.raj@...el.com>,
    "Liu, Yi L" <yi.l.liu@...el.com>,
    "Wu, Hao" <hao.wu@...el.com>,
    "Jiang, Dave" <dave.jiang@...el.com>,
    Jacob Pan <jacob.jun.pan@...ux.intel.com>,
    Kirti Wankhede <kwankhede@...dia.com>,
    Robin Murphy <robin.murphy@....com>,
    "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
    "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
    David Woodhouse <dwmw2@...radead.org>,
    LKML <linux-kernel@...r.kernel.org>,
    Lu Baolu <baolu.lu@...ux.intel.com>
Subject: Re: [RFC v2] /dev/iommu uAPI proposal

Hi Kevin,

On 8/5/21 2:36 AM, Tian, Kevin wrote:
>> From: Eric Auger <eric.auger@...hat.com>
>> Sent: Wednesday, August 4, 2021 11:59 PM
>>
> [...]
>>> 1.2. Attach Device to I/O address space
>>> +++++++++++++++++++++++++++++++++++++++
>>>
>>> Device attach/bind is initiated through passthrough framework uAPI.
>>>
>>> Device attaching is allowed only after a device is successfully bound to
>>> the IOMMU fd. User should provide a device cookie when binding the
>>> device through VFIO uAPI. This cookie is used when the user queries
>>> device capability/format, issues per-device iotlb invalidation and
>>> receives per-device I/O page fault data via IOMMU fd.
>>>
>>> Successful binding puts the device into a security context which isolates
>>> its DMA from the rest system.
>>> VFIO should not allow user to access the
>> s/from the rest system/from the rest of the system
>>> device before binding is completed. Similarly, VFIO should prevent the
>>> user from unbinding the device before user access is withdrawn.
>> With Intel scalable IOV, I understand you could assign an RID/PASID to
>> one VM and another one to another VM (which is not the case for ARM). Is
>> it a targeted use case? How would it be handled? Is it related to the
>> sub-groups mentioned hereafter?
> Not related to sub-group. Each mdev is bound to the IOMMU fd respectively
> with the defPASID which represents the mdev.

But how does it work in terms of security? The device (RID) is bound to
an IOMMU fd. But then each SID/PASID may be working for a different VM.
How do you detect that this is safe, given that each SID can work safely
for a different VM, versus the ARM case where it is not possible?

1.3 says
"
1) A successful binding call for the first device in the group creates
   the security context for the entire group, by:
"
What does it mean for the above scalable IOV use case?

>
>> Actually all devices bound to an IOMMU fd should have the same parent
>> I/O address space or root address space, am I correct? If so, maybe add
>> this comment explicitly?
> in most cases yes but it's not mandatory. multiple roots are allowed
> (e.g. with vIOMMU but no nesting).

OK, right, this corresponds to example 4.2, for instance. I misinterpreted
the notion of security context. The security context does not match the
IOMMU fd but is something implicitly created on the 1st device binding.
>
> [...]
>>> The device in the /dev/iommu context always refers to a physical one
>>> (pdev) which is identifiable via RID. Physically each pdev can support
>>> one default I/O address space (routed via RID) and optionally multiple
>>> non-default I/O address spaces (via RID+PASID).
>>>
>>> The device in VFIO context is a logic concept, being either a physical
>>> device (pdev) or mediated device (mdev or subdev).
>>> Each vfio device
>>> is represented by RID+cookie in IOMMU fd. User is allowed to create
>>> one default I/O address space (routed by vRID from user p.o.v) per
>>> each vfio_device.
>> The concept of default address space is not fully clear to me. I
>> currently understand this is a
>> root address space (not nesting). Is that correct? This may need
>> clarification.
> w/o PASID there is only one address space (either GPA or GIOVA)
> per device. This one is called default. whether it's root is orthogonal
> (e.g. GIOVA could be also nested) to the device view of this space.
>
> w/ PASID additional address spaces can be targeted by the device.
> those are called non-default.
>
> I could also rename default to RID address space and non-default to
> RID+PASID address space if doing so makes it clearer.

Yes, I think it is worth having a kind of glossary and defining root
address space and default address space, as you clearly defined
child/parent.
>
>>> VFIO decides the routing information for this default
>>> space based on device type:
>>>
>>> 1) pdev, routed via RID;
>>>
>>> 2) mdev/subdev with IOMMU-enforced DMA isolation, routed via
>>>    the parent's RID plus the PASID marking this mdev;
>>>
>>> 3) a purely sw-mediated device (sw mdev), no routing required i.e. no
>>>    need to install the I/O page table in the IOMMU. sw mdev just uses
>>>    the metadata to assist its internal DMA isolation logic on top of
>>>    the parent's IOMMU page table;
>> Maybe you should introduce this concept of SW mediated device earlier
>> because it seems to special case the way the attach behaves. I am
>> especially referring to
>>
>> "Successful attaching activates an I/O address space in the IOMMU, if the
>> device is not purely software mediated"
> makes sense.
>
>>> In addition, VFIO may allow user to create additional I/O address spaces
>>> on a vfio_device based on the hardware capability.
>>> In such case the user
>>> has its own view of the virtual routing information (vPASID) when marking
>>> these non-default address spaces.
>> I do not understand what "marking these non default address space" means.
> as explained above, those non-default address spaces are identified/routed
> via PASID.
>
>>> 1.3. Group isolation
>>> ++++++++++++++++++++
> [...]
>>> 1) A successful binding call for the first device in the group creates
>>>    the security context for the entire group, by:
>>>
>>>    * Verifying group viability in a similar way as VFIO does;
>>>
>>>    * Calling IOMMU-API to move the group into a block-dma state,
>>>      which makes all devices in the group attached to an block-dma
>>>      domain with an empty I/O page table;
>> this block-dma state/domain would deserve to be better defined (I know
>> you already mentioned it in 1.1 with the dma mapping protocol though)
>> activates an empty I/O page table in the IOMMU (if the device is not
>> purely SW mediated)?
> sure. some explanations are scattered in following paragraph, but I
> can consider to further clarify it.
>
>> How does that relate to the default address space? Is it the same?
> different. this block-dma domain doesn't hold any valid mapping. The
> default address space is represented by a normal unmanaged domain.
> the ioasid attaching operation will detach the device from the block-dma
> domain and then attach it to the target ioasid.

OK

Thanks

Eric
>
>>> 2. uAPI Proposal
>>> ----------------------
> [...]
>>> /*
>>>  * Allocate an IOASID.
>>>  *
>>>  * IOASID is the FD-local software handle representing an I/O address
>>>  * space. Each IOASID is associated with a single I/O page table. User
>>>  * must call this ioctl to get an IOASID for every I/O address space that is
>>>  * intended to be tracked by the kernel.
>>>  *
>>>  * User needs to specify the attributes of the IOASID and associated
>>>  * I/O page table format information according to one or multiple devices
>>>  * which will be attached to this IOASID right after. The I/O page table
>>>  * is activated in the IOMMU when it's attached by a device. Incompatible
>> .. if not SW mediated
>>>  * format between device and IOASID will lead to attaching failure.
>>>  *
>>>  * The root IOASID should always have a kernel-managed I/O page
>>>  * table for safety. Locked page accounting is also conducted on the root.
>> The definition of root IOASID is not easily found in this spec. Maybe
>> this would deserve some clarification.
> make sense.
>
> and thanks for other typo-related comments.
>
> Thanks
> Kevin
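
For readers following the thread, the three routing cases quoted above (pdev via RID, mdev via the parent's RID plus a PASID, sw mdev with no routing) can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the RFC: `dev_kind`, `vfio_dev_sketch`, `routing` and `default_routing()` are hypothetical names made up for this sketch and do not exist in the kernel or in the proposed uAPI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device kinds, mirroring the three cases in the proposal. */
enum dev_kind { DEV_PDEV, DEV_MDEV, DEV_SW_MDEV };

/* Hypothetical view of a vfio device; not a kernel structure. */
struct vfio_dev_sketch {
    enum dev_kind kind;
    uint32_t rid;       /* own RID for a pdev, the parent's RID for an mdev */
    uint32_t def_pasid; /* PASID marking an mdev; unused for other kinds */
};

/* Hypothetical routing result for the default I/O address space. */
struct routing {
    bool installs_pgtable; /* false: nothing is installed in the IOMMU */
    bool uses_pasid;       /* true: routed via RID+PASID */
    uint32_t rid;
    uint32_t pasid;
};

/* Pick the routing information for the default address space by type. */
struct routing default_routing(const struct vfio_dev_sketch *d)
{
    assert(d != NULL);

    /* Fields not named here start out as false / 0. */
    struct routing r = { .rid = d->rid };

    switch (d->kind) {
    case DEV_PDEV:
        /* 1) pdev: routed via RID only. */
        r.installs_pgtable = true;
        break;
    case DEV_MDEV:
        /* 2) mdev/subdev: parent's RID plus the PASID marking the mdev. */
        r.installs_pgtable = true;
        r.uses_pasid = true;
        r.pasid = d->def_pasid;
        break;
    case DEV_SW_MDEV:
        /* 3) sw mdev: no routing, no I/O page table in the IOMMU. */
        break;
    }
    return r;
}
```

In this sketch the sw mdev case is the only one that leaves `installs_pgtable` false, which is the special case the ".. if not SW mediated" remarks in the thread refer to.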