Date:   Tue, 15 Jun 2021 10:12:15 -0600
From:   Alex Williamson <alex.williamson@...hat.com>
To:     "Tian, Kevin" <kevin.tian@...el.com>
Cc:     Jason Gunthorpe <jgg@...dia.com>, Joerg Roedel <joro@...tes.org>,
        Jean-Philippe Brucker <jean-philippe@...aro.org>,
        David Gibson <david@...son.dropbear.id.au>,
        "Jason Wang" <jasowang@...hat.com>,
        "parav@...lanox.com" <parav@...lanox.com>,
        "Enrico Weigelt, metux IT consult" <lkml@...ux.net>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Shenming Lu <lushenming@...wei.com>,
        Eric Auger <eric.auger@...hat.com>,
        Jonathan Corbet <corbet@....net>,
        "Raj, Ashok" <ashok.raj@...el.com>,
        "Liu, Yi L" <yi.l.liu@...el.com>, "Wu, Hao" <hao.wu@...el.com>,
        "Jiang, Dave" <dave.jiang@...el.com>,
        Jacob Pan <jacob.jun.pan@...ux.intel.com>,
        Kirti Wankhede <kwankhede@...dia.com>,
        "Robin Murphy" <robin.murphy@....com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        "David Woodhouse" <dwmw2@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        "Lu Baolu" <baolu.lu@...ux.intel.com>
Subject: Re: Plan for /dev/ioasid RFC v2

On Tue, 15 Jun 2021 02:31:39 +0000
"Tian, Kevin" <kevin.tian@...el.com> wrote:

> > From: Alex Williamson <alex.williamson@...hat.com>
> > Sent: Tuesday, June 15, 2021 12:28 AM
> >   
> [...]
> > > IOASID. Today the group fd requires an IOASID before it hands out a
> > > device_fd. With iommu_fd the device_fd will not allow IOCTLs until it
> > > has a blocked DMA IOASID and is successfully joined to an iommu_fd.  
> > 
> > Which is the root of my concern.  Who owns ioctls to the device fd?
> > It's my understanding this is a vfio provided file descriptor and it's
> > therefore vfio's responsibility.  A device-level IOASID interface
> > therefore requires that vfio manage the group aspect of device access.
> > AFAICT, that means that device access can therefore only begin when all
> > devices for a given group are attached to the IOASID and must halt for
> > all devices in the group if any device is ever detached from an IOASID,
> > even temporarily.  That suggests a lot more oversight of the IOASIDs by
> > vfio than I'd prefer.
> >   
> 
> This is possibly the point that is worthy of more clarification and
> alignment, as it sounds like the root of controversy here.
> 
> I feel the goal of vfio group management is more about ownership, i.e. 
> all devices within a group must be assigned to a single user. Following
> the three rules defined by Jason, what we really care about is whether a group
> of devices can be isolated from the rest of the world, i.e. no access to
> memory/device outside of its security context and no access to its 
> security context from devices outside of this group. This can be achieved
> as long as every device in the group is either in block-DMA state when 
> it's not attached to any security context or attached to an IOASID context 
> in IOMMU fd.
> 
> As long as group-level isolation is satisfied, how devices within a group 
> are further managed is decided by the user (unattached, all attached to 
> same IOASID, attached to different IOASIDs) as long as the user 
> understands the implications of lacking isolation within the group. This 
> is where a device-centric model comes into play. Misconfiguration just hurts 
> the user itself.
> 
> If this rationale can be agreed upon, then I don't see the point of having
> VFIO mandate that all devices in the group be attached/detached in
> lockstep. 
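
The isolation rule proposed in the quoted text can be written down as a
one-line predicate.  This is only a minimal sketch: the state names
(HOST_DOMAIN, BLOCKED, ATTACHED) and the function group_is_isolated()
are hypothetical labels chosen for illustration, not part of any
proposed /dev/ioasid or vfio API.

```python
from enum import Enum


class DmaState(Enum):
    # Hypothetical labels for the three situations a device can be in.
    HOST_DOMAIN = "host"      # default kernel DMA domain, used by a host driver
    BLOCKED = "blocked"       # block-DMA: not attached to any security context
    ATTACHED = "attached"     # attached to an IOASID inside the user's iommu_fd


def group_is_isolated(states):
    """Return True if every device in the group is either in block-DMA
    state or attached to an IOASID, per the rule quoted above."""
    return all(s in (DmaState.BLOCKED, DmaState.ATTACHED) for s in states)


# One function attached, its sibling merely blocked: still isolated.
mixed_but_safe = group_is_isolated([DmaState.ATTACHED, DmaState.BLOCKED])
# One function still driven by the host: the group is not isolated.
host_sibling = group_is_isolated([DmaState.ATTACHED, DmaState.HOST_DOMAIN])
```

The predicate says nothing about which entity evaluates it or when,
which is what the examples below try to pin down.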

In theory this sounds great, but there are still too many assumptions
and too much hand waving about where isolation occurs for me to feel
like I really have the complete picture.  So let's walk through some
examples.  Please fill in and correct where I'm wrong.

1) A dual-function PCIe e1000e NIC where the functions are grouped
   together due to ACS isolation issues.

   a) Initial state: functions 0 & 1 are both bound to e1000e driver.

   b) Admin uses driverctl to bind function 1 to vfio-pci, creating
      vfio device file, which is chmod'd to grant access to a user.

   c) User opens vfio function 1 device file and an iommu_fd, binds
      device_fd to iommu_fd.

   Does this succeed?
     - if no, specifically where does it fail?
     - if yes, vfio can now allow access to the device?

   d) Repeat b) for function 0.

   e) Repeat c), still using function 1.  Is it different?  Where?  Why?
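
For comparison, a group-based model can answer 1.c) and 1.e) with a
single viability check over the whole group.  The sketch below is
loosely modeled on that idea and makes simplifying assumptions: the
helper group_viable() is hypothetical, and the set of acceptable
drivers is reduced to "vfio-pci or no driver".  Whether iommu_fd
performs an equivalent check, and where, is exactly the open question.

```python
def group_viable(drivers):
    """drivers maps each function in the group to its bound driver name,
    or None if no driver is bound.  Assumption for this sketch: a group
    is viable only if no function is driven by a host driver."""
    return all(drv in (None, "vfio-pci") for drv in drivers.values())


# State at 1.c): function 0 is still bound to e1000e.
step_c = {"fn0": "e1000e", "fn1": "vfio-pci"}
# State at 1.e): function 0 has been rebound to vfio-pci as well.
step_e = {"fn0": "vfio-pci", "fn1": "vfio-pci"}

step_c_allowed = group_viable(step_c)
step_e_allowed = group_viable(step_e)
```

Under these assumptions 1.c) is refused and 1.e) is allowed, even
though the user only ever touches function 1.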

2) The same NIC as 1)

   a) Initial state: functions 0 & 1 bound to vfio-pci, vfio device
      files granted to user, user has bound both device_fds to the same
      iommu_fd.

   AIUI, even though not attached to an IOASID, vfio can now enable access
   through the device_fds, right?  What specific entity has placed these
   devices into a block DMA state, when, and how?

   b) Both devices are attached to the same IOASID.

   Are we assuming that each device was atomically moved to the new
   IOMMU context by the IOASID code?  What if the IOMMU cannot change
   the domain atomically?

   c) The device_fd for function 1 is detached from the IOASID.

   Are we assuming the reverse of b) performed by the IOASID code?

   d) The device_fd for function 1 is unbound from the iommu_fd.

   Does this succeed?
     - if yes, what is the resulting IOMMU context of the device and
       who owns it?
     - if no, well, that results in numerous tear-down issues.

   e) Function 1 is unbound from vfio-pci.

   Does this work or is it blocked?  If blocked, by what entity
   specifically?

   f) Function 1 is bound to e1000e driver.

   We clearly have a violation here.  Specifically, where in this path,
   and by whom, should we have been prevented from getting here, or who
   triggers the BUG_ON to abort this?
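
The driver-binding steps 2.e) and 2.f) can be replayed to see which
one first leaves a host driver and a user-owned function in the same
group.  This is a sketch under stated assumptions: first_violation()
and the step tuples are hypothetical, and "user-owned" is approximated
as "bound to vfio-pci".  It shows where the violation appears, not
which kernel component ought to block it.

```python
def first_violation(initial, steps):
    """Replay (label, function, new_driver) steps over a copy of the
    initial driver bindings.  Return the label of the first step after
    which a host driver and a vfio-pci function coexist in the group,
    or None if that never happens."""
    drivers = dict(initial)
    for label, fn, new_driver in steps:
        drivers[fn] = new_driver
        user_owned = any(d == "vfio-pci" for d in drivers.values())
        host_owned = any(d not in (None, "vfio-pci") for d in drivers.values())
        if user_owned and host_owned:
            return label
    return None


initial = {"fn0": "vfio-pci", "fn1": "vfio-pci"}
steps = [
    ("2.e", "fn1", None),       # function 1 unbound from vfio-pci
    ("2.f", "fn1", "e1000e"),   # function 1 bound to the host driver
]
violating_step = first_violation(initial, steps)
```

In this model the unbind at 2.e) is harmless and the bind at 2.f) is
the step that must be refused or trapped.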

3) A dual-function conventional PCI e1000 NIC where the functions are
   grouped together due to shared RID.

   a) Repeat 2.a) and 2.b) such that we have valid, user-accessible
      devices in the same IOMMU context.

   b) Function 1 is detached from the IOASID.

   I think function 1 cannot be placed into a different IOMMU context
   here.  Does the detach work?  What's the IOMMU context now?

   c) A new IOASID is alloc'd within the existing iommu_fd and function
      1 is attached to the new IOASID.

   Where, how, and by whom does this fail?
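
The shared-RID constraint behind 3.b) and 3.c) can be modeled with a
table keyed by requester ID.  This is a toy sketch, not IOASID code:
the Iommu class, its methods, and the RID value 0x0100 are all
hypothetical.  The only assumption carried over from the scenario is
that the IOMMU selects a context by RID, so functions sharing a RID
cannot be in different contexts.

```python
class SharedRidError(Exception):
    """Raised when functions sharing a RID would need different contexts."""


class Iommu:
    def __init__(self, rid_of):
        self.rid_of = rid_of    # function name -> requester ID
        self.attached = {}      # function name -> IOASID

    def attach(self, fn, ioasid):
        rid = self.rid_of[fn]
        for other, other_ioasid in self.attached.items():
            if other != fn and self.rid_of[other] == rid and other_ioasid != ioasid:
                raise SharedRidError(f"{fn} shares RID {rid:#06x} with {other}")
        self.attached[fn] = ioasid

    def detach(self, fn):
        self.attached.pop(fn, None)

    def effective_ioasid(self, fn):
        """IOASID that DMA from fn actually hits, found via its RID."""
        rid = self.rid_of[fn]
        for other, ioasid in self.attached.items():
            if self.rid_of[other] == rid:
                return ioasid
        return None


iommu = Iommu({"fn0": 0x0100, "fn1": 0x0100})
iommu.attach("fn0", 1)                            # 3.a)
iommu.attach("fn1", 1)
iommu.detach("fn1")                               # 3.b)
after_detach = iommu.effective_ioasid("fn1")
try:
    iommu.attach("fn1", 2)                        # 3.c)
    attach_failed = False
except SharedRidError:
    attach_failed = True
```

In this model the detach at 3.b) only changes bookkeeping: function 1's
DMA still lands in IOASID 1 through the shared RID, and the attach at
3.c) has to be rejected by whichever layer knows about the aliasing.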

If vfio gets to offload all of its group management to IOASID code,
that's great, but I'm afraid that IOASID is so focused on a
device-level API that we're instead just ignoring the group dynamics
and vfio will be forced to provide oversight to maintain secure
userspace access.  Thanks,

Alex
