Message-ID: <MWHPR11MB18862452FD4172DCA70C89B88C569@MWHPR11MB1886.namprd11.prod.outlook.com>
Date:   Sat, 8 May 2021 07:31:18 +0000
From:   "Tian, Kevin" <kevin.tian@...el.com>
To:     Alex Williamson <alex.williamson@...hat.com>
CC:     Jason Gunthorpe <jgg@...dia.com>, "Liu, Yi L" <yi.l.liu@...el.com>,
        "Jacob Pan" <jacob.jun.pan@...ux.intel.com>,
        Auger Eric <eric.auger@...hat.com>,
        Jean-Philippe Brucker <jean-philippe@...aro.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Joerg Roedel <joro@...tes.org>,
        Lu Baolu <baolu.lu@...ux.intel.com>,
        David Woodhouse <dwmw2@...radead.org>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
        Tejun Heo <tj@...nel.org>, Li Zefan <lizefan@...wei.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jean-Philippe Brucker <jean-philippe@...aro.com>,
        Jonathan Corbet <corbet@....net>,
        "Raj, Ashok" <ashok.raj@...el.com>, "Wu, Hao" <hao.wu@...el.com>,
        "Jiang, Dave" <dave.jiang@...el.com>
Subject: RE: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and allocation
 APIs

> From: Alex Williamson <alex.williamson@...hat.com>
> Sent: Saturday, May 8, 2021 1:06 AM
> 
> > > Those are the main ones I can think of.  It is nice to have a simple
> > > map/unmap interface, I'd hope that a new /dev/ioasid interface wouldn't
> > > raise the barrier to entry too high, but the user needs to have the
> > > ability to have more control of their mappings and locked page
> > > accounting should probably be offloaded somewhere.  Thanks,
> > >
> >
> > Based on your feedbacks I feel it's probably reasonable to start with
> > a type1v2 semantics for the new interface. Locked accounting could
> > also start with the same VFIO restriction and then improve it
> > incrementally, if a cleaner way is intrusive (if not affecting uAPI).
> > But I didn't get the suggestion on "more control of their mappings".
> > Can you elaborate?
> 
> Things like I note above, userspace cannot currently specify mapping
> granularity nor has any visibility to the granularity they get from the
> IOMMU.  What actually happens in the IOMMU is pretty opaque to the user
> currently.  Thanks,
> 

That's much clearer. Based on the discussion so far, I'm thinking of a
staged approach to building the new interface, basically following the
model that Jason pointed out: generic stuff first, then platform-specific
extensions:

Phase 1: /dev/ioasid with core ingredients and vfio type1v2 semantics
    - ioasid is the software handle representing an I/O page table
    - uAPI accepts a type1v2 map/unmap semantics per ioasid
    - helpers for VFIO/VDPA to bind ioasid_fd and attach ioasids
    - multiple ioasids are allowed without nesting (vIOMMU, or devices
w/ incompatible iommu attributes)
    - an ioasid disallows any operation before it's attached to a device
    - an ioasid inherits iommu attributes from the 1st device attached
to it
    - userspace is expected to manage hardware restrictions and the
kernel only returns an error when restrictions are broken
        * map/unmap on an ioasid will fail until every device in a group
is attached to it
        * ioasid attach will fail if the new device has iommu attributes
incompatible with those of this ioasid
    - thus no group semantics in uAPI
    - no change to vfio container/group/type1 logic, for running existing
vfio applications
        * implies some duplication between vfio type1 and ioasid for some time
    - new uAPI in vfio to allow explicit opening of a device and then binding
it to the ioasid_fd
        * possibly requires each device to be exposed in /dev/vfio/
    - support both pdev and mdev
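
To make the Phase 1 rules concrete, here is a rough toy model of them in
Python. It is only a sketch of the semantics listed above, not a proposed
implementation: Device, Ioasid, IoasidError and the group_members table
are all hypothetical names, attributes are reduced to a plain set, and
the unmap check reflects my reading of type1v2 (an unmap may not split an
existing mapping).

```python
from dataclasses import dataclass


class IoasidError(Exception):
    """Stands in for an errno returned by the (hypothetical) uAPI."""


@dataclass(frozen=True)
class Device:
    name: str
    group: int          # iommu group the device belongs to
    attrs: frozenset    # iommu attributes, reduced to a set of strings


class Ioasid:
    """Toy model of one I/O page table handle under the Phase 1 rules."""

    def __init__(self, group_members):
        self.group_members = group_members  # group id -> set of device names
        self.attrs = None                   # inherited from the 1st device
        self.attached = set()
        self.mappings = {}                  # iova -> (vaddr, size)

    def attach(self, dev):
        if self.attrs is None:
            self.attrs = dev.attrs          # first attach defines attributes
        elif dev.attrs != self.attrs:
            raise IoasidError("incompatible iommu attributes")
        self.attached.add(dev)

    def _check_usable(self):
        if not self.attached:
            raise IoasidError("ioasid not attached to any device")
        names = {d.name for d in self.attached}
        for dev in self.attached:
            # Every device of each touched group must be attached first.
            if not self.group_members[dev.group] <= names:
                raise IoasidError("group only partially attached")

    def map(self, iova, vaddr, size):
        self._check_usable()
        self.mappings[iova] = (vaddr, size)

    def unmap(self, iova, size):
        self._check_usable()
        end = iova + size
        covered = []
        for start, (_, length) in self.mappings.items():
            m_end = start + length
            if start >= end or m_end <= iova:
                continue                    # no overlap with the request
            if start < iova or m_end > end:
                raise IoasidError("unmap would split an existing mapping")
            covered.append(start)
        return sum(self.mappings.pop(s)[1] for s in covered)
```

Note there is no group object visible to the caller here: the group only
shows up as an error condition, which is the "no group semantics in uAPI"
point above.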

Phase 2: ioasid nesting
    - Allow bind/unbind_pgtable semantics per ioasid
    - Allow ioasid nesting 
        * HW ioasid nesting if supported by platform
        * otherwise fall back to SW ioasid nesting (in-kernel shadowing)
    - iotlb invalidation per ioasid
    - I/O page fault handling per ioasid
    - hw_id is not exposed in uAPI. Vendor IOMMU driver decides
when/how hw_id is allocated and programmed properly
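
For the SW nesting fallback, the idea of in-kernel shadowing can be
sketched as composing the child table with its parent into a single
table that the hardware walks. The snippet below is a toy, page-granular
model under that assumption; shadow(), translate() and the dict-based
tables are made up for illustration, and a real implementation would
also have to track invalidations and faults per ioasid.

```python
PAGE = 0x1000  # assume a single 4 KiB page size for simplicity


def shadow(child, parent):
    """Compose child (iova -> gpa) with parent (gpa -> hpa) page tables.

    Both tables map page-aligned addresses to page-aligned addresses.
    Child entries whose target is not mapped in the parent are skipped,
    so an access through them would fault in the merged table.
    """
    merged = {}
    for iova, gpa in child.items():
        if gpa in parent:
            merged[iova] = parent[gpa]
    return merged


def translate(table, addr):
    """Translate one address through a single-level table."""
    page, offset = addr & ~(PAGE - 1), addr & (PAGE - 1)
    return table[page] | offset


def translate_nested(child, parent, addr):
    """Two-stage walk, as HW nesting would do it."""
    return translate(parent, translate(child, addr))
```

For any address mapped at both levels, translate(shadow(child, parent),
addr) and translate_nested(child, parent, addr) should agree, which is
what lets the two nesting flavors share the same per-ioasid semantics.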

Phase 3: optimizations and vendor extensions (order undefined, up to
the specific feature owner):
    - (Intel) ENQCMD support with hw_id exposure in uAPI
    - (ARM/AMD) RID-based pasid table assignment
    - (PPC) window-based iova management
    - Optimizations:
        * replace vfio type1 with a shim driver to use ioasid backend
        * mapping granularity
        * HW dirty page tracking
        * ...

Does the above sound like a sensible plan? If yes, we'll start working
on Phase 1.

Thanks
Kevin
