Message-ID: <20210426123817.GQ1370958@nvidia.com>
Date:   Mon, 26 Apr 2021 09:38:17 -0300
From:   Jason Gunthorpe <jgg@...dia.com>
To:     "Tian, Kevin" <kevin.tian@...el.com>
Cc:     Alex Williamson <alex.williamson@...hat.com>,
        "Liu, Yi L" <yi.l.liu@...el.com>,
        Jacob Pan <jacob.jun.pan@...ux.intel.com>,
        Auger Eric <eric.auger@...hat.com>,
        Jean-Philippe Brucker <jean-philippe@...aro.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Joerg Roedel <joro@...tes.org>,
        Lu Baolu <baolu.lu@...ux.intel.com>,
        David Woodhouse <dwmw2@...radead.org>,
        "iommu@...ts.linux-foundation.org" <iommu@...ts.linux-foundation.org>,
        "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
        Tejun Heo <tj@...nel.org>, Li Zefan <lizefan@...wei.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Jean-Philippe Brucker <jean-philippe@...aro.com>,
        Jonathan Corbet <corbet@....net>,
        "Raj, Ashok" <ashok.raj@...el.com>, "Wu, Hao" <hao.wu@...el.com>,
        "Jiang, Dave" <dave.jiang@...el.com>
Subject: Re: [PATCH V4 05/18] iommu/ioasid: Redefine IOASID set and
 allocation APIs

On Sun, Apr 25, 2021 at 09:24:46AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@...dia.com>
> > Sent: Friday, April 23, 2021 7:50 PM
> > 
> > On Fri, Apr 23, 2021 at 09:06:44AM +0000, Tian, Kevin wrote:
> > 
> > > Or could we still have just one /dev/ioasid but allow userspace to create
> > > multiple gpa_ioasid_id's each associated to a different iommu domain?
> > > Then the compatibility check will be done at ATTACH_IOASID instead of
> > > JOIN_IOASID_FD.
> > 
> > To my mind what makes sense is that /dev/ioasid presents a single
> > IOMMU behavior that is basically the same. This may ultimately not be
> > what we call a domain today.
> > 
> > We may end up with a middle object which is a group of domains that
> > all have the same capabilities, and we define capabilities in a way
> > that most platforms have a single group of domains.
> > 
> > The key capability of a group of domains is they can all share the HW
> > page table representation, so if an IOASID instantiates a page table
> > it can be assigned to any device on any domain in the group of domains.
> 
> Sorry that I didn't quite get it. If a group of domains can share the 
> same page table then why not just attaching all devices under those
> domains into a single domain?

Sure, if that works. But you shouldn't have things like IOMMU_CACHE
create different domains or trigger different /dev/ioasid's.

> to describe the HW page table. Ideally a new iommu domain should
> be created only when it's impossible to share an existing page table. 
> Otherwise you'll get bad iotlb efficiency because each domain has its
> unique domain id (tagged in iotlb) then duplicated iotlb entries may
> exist even when a single page table is shared by those domains.

Right, fewer is better.

> Or, can you elaborate what is the targeted usage by having a group of
> domains which all share the same page table?

You just need to have a clear rule for what requires a new /dev/ioasid
FD - and if it maps to domains then great.

> Want to hear your opinion for one open here. There is no doubt that
> an ioasid represents a HW page table when the table is constructed by 
> userspace and then linked to the IOMMU through the bind/unbind
> API. But I'm not very sure about whether an ioasid should represent 
> the exact pgtable or the mapping metadata when the underlying 
> pgtable is indirectly constructed through map/unmap API. VFIO does
> the latter way, which is why it allows multiple incompatible domains
> in a single container which all share the same mapping metadata.

I think VFIO's map/unmap is way too complex and we know it has bad
performance problems. 

If /dev/ioasid is a single HW page table only then I would focus on that
implementation and leave it to userspace to span different
/dev/ioasids if needed.

> OK, now I see where the disconnection comes from. In my context ioasid
> is the identifier that is actually used in the wire, but seems you treat it as 
> a sw-defined namespace purely for representing page tables. We should 
> clear this concept first before further discussing other details. 😊

There is no general HW requirement that every IO page table be
referred to by the same PASID and this API would have to support
non-PASID IO page tables as well. So I'd keep the two things
separated in the uAPI - even though the kernel today has a global
PASID pool.

> Then following your proposal, does it mean that we need another
> interface for allocating PASID? and since ioasid means different
> thing in uAPI and in-kernel API, possibly a new name is required to
> avoid confusion?

I would suggest having two ways to control the PASID:

 1) Over /dev/ioasid allocate a PASID for an IOASID. All future PASID
    based usages of the IOASID will use that global PASID

 2) Over the device FD, when the IOASID is bound return the PASID that
    was selected. If the IOASID does not have a global PASID then the
    kernel is free to make something up. In this mode a single IOASID
    can have multiple PASIDs.

Simple things like DPDK can use #2 and potentially have better PASID
limits. Hypervisors will most likely have to use #1, but it depends on
how their vIOMMU interface works.
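
To make the difference between #1 and #2 concrete, here is a toy
userspace model, not kernel code. Every name in it (IoasidFd, DeviceFd,
alloc_pasid, bind_ioasid, local_base) is hypothetical and only stands in
for whatever the real uAPI would look like. It assumes a global PASID
pool behind /dev/ioasid and a per-device PASID space behind each device
FD.

```python
# Toy model only: all class and method names are hypothetical, not a uAPI.
import itertools


class IoasidFd:
    """Stands in for an open /dev/ioasid FD."""

    def __init__(self):
        self._next_ioasid = itertools.count(1)
        self._pasid_pool = itertools.count(1)  # stand-in for the global PASID pool
        self.global_pasid = {}                 # ioasid -> PASID, filled only by option 1

    def alloc_ioasid(self):
        return next(self._next_ioasid)

    def alloc_pasid(self, ioasid):
        """Option 1: pin one global PASID to the IOASID."""
        if ioasid not in self.global_pasid:
            self.global_pasid[ioasid] = next(self._pasid_pool)
        return self.global_pasid[ioasid]


class DeviceFd:
    """Stands in for a device FD with its own PASID space."""

    def __init__(self, ioasid_fd, local_base):
        self.ioasid_fd = ioasid_fd
        self._local_pasids = itertools.count(local_base)  # arbitrary per-device range
        self.bound = {}                                   # ioasid -> PASID used on this device

    def bind_ioasid(self, ioasid):
        """Option 2: bind the IOASID and return the PASID that was selected."""
        pasid = self.ioasid_fd.global_pasid.get(ioasid)
        if pasid is None:
            # No global PASID: the "kernel" is free to make something up.
            pasid = next(self._local_pasids)
        self.bound[ioasid] = pasid
        return pasid


fd = IoasidFd()
dev_a = DeviceFd(fd, local_base=100)
dev_b = DeviceFd(fd, local_base=200)

pinned = fd.alloc_ioasid()
fd.alloc_pasid(pinned)          # option 1: every device sees the same PASID
floating = fd.alloc_ioasid()    # option 2: each device may pick its own

print(dev_a.bind_ioasid(pinned), dev_b.bind_ioasid(pinned))
print(dev_a.bind_ioasid(floating), dev_b.bind_ioasid(floating))
```

In this model the pinned IOASID resolves to one PASID on both devices,
while the floating IOASID ends up with a different PASID per device,
which is the "single IOASID can have multiple PASIDs" case from #2.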

I think the name IOASID is fine for the uAPI, the kernel version can
be called ioasid_id or something.

(also looking at ioasid.c, why do we need such a thin and odd wrapper
around xarray?)

Jason
