lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <BL1PR11MB527176BE1F6DFA3D2942664D8CDD9@BL1PR11MB5271.namprd11.prod.outlook.com>
Date:   Mon, 13 Feb 2023 02:24:40 +0000
From:   "Tian, Kevin" <kevin.tian@...el.com>
To:     Jason Gunthorpe <jgg@...dia.com>,
        Alex Williamson <alex.williamson@...hat.com>
CC:     Nicolin Chen <nicolinc@...dia.com>,
        "joro@...tes.org" <joro@...tes.org>,
        "will@...nel.org" <will@...nel.org>,
        "robin.murphy@....com" <robin.murphy@....com>,
        "shuah@...nel.org" <shuah@...nel.org>,
        "Liu, Yi L" <yi.l.liu@...el.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "linux-kselftest@...r.kernel.org" <linux-kselftest@...r.kernel.org>,
        "baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
        "Raj, Ashok" <ashok.raj@...el.com>
Subject: RE: [PATCH v2 02/10] iommu: Introduce a new
 iommu_group_replace_domain() API

> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Saturday, February 11, 2023 8:45 AM
> 
> On Fri, Feb 10, 2023 at 04:51:10PM -0700, Alex Williamson wrote:
> > On Tue, 7 Feb 2023 13:17:54 -0800
> > Nicolin Chen <nicolinc@...dia.com> wrote:
> >
> > > qemu has a need to replace the translations associated with a domain
> > > when the guest does large-scale operations like switching between an
> > > IDENTITY domain and, say, dma-iommu.c.
> > >
> > > Currently, it does this by replacing all the mappings in a single
> > > domain, but this is very inefficient and means that domains have to be
> > > per-device rather than per-translation.
> > >
> > > Provide a high-level API to allow replacements of one domain with
> > > another. This is similar to a detach/attach cycle except it doesn't
> > > force the group to go to the blocking domain in-between.
> > >
> > > By removing this forced blocking domain the iommu driver has the
> > > opportunity to implement an atomic replacement of the domains to the
> > > greatest extent its hardware allows.
> > >
> > > It could be possible to adderss this by simply removing the protection
> > > from the iommu_attach_group(), but it is not so clear if that is safe
> > > for the few users. Thus, add a new API to serve this new purpose.
> > >
> > > Atomic replacement allows the qemu emulation of the viommu to be
> more
> > > complete, as real hardware has this ability.
> >
> > I was under the impression that we could not atomically switch a
> > device's domain relative to in-flight DMA.

it's possible as long as the mappings for in-flight DMA don't change
in the transition.

> 
> Certainly all the HW can be proper atomic but not necessarily easily -
> the usual issue is a SW complication to manage the software controlled
> cache tags in a way that doesn't corrupt the cache.
> 
> This is because the cache tag and the io page table top are in
> different 64 bit words so atomic writes don't cover both, and thus the
> IOMMU HW could tear the two stores and mismatch the cache tag to the
> table top. This would corrupt the cache.

VT-d spec recommends using 128bit cmpxchg instruction to update
page table pointer and DID together.

> 
> The easiest way to avoid this is for SW to use the same DID for the
> new and old tables. This is possible if this translation entry is the
> only user of the DID. A more complex way would use a temporary DID
> that can be safely corrupted. But realistically I'd expect VT-d
> drivers to simply make the PASID invalid for the duration of the
> update.
> 
> However something like AMD has a single cache tag for every entry in
> the PASID table so you could do an atomic replace trivially. Just
> update the PASID and invalidate the caches.
> 
> ARM has a flexible PASID table and atomic replace would be part of
> resizing it. eg you can atomically update the top of the PASID table
> with a single 64 bit store.
> 
> So replace lets the drivers implement those special behaviors if it
> makes sense for them.
> 
> > Or maybe atomic is the wrong word here since we expect no in-flight DMA
> > during the sort of mode transitions referred to here, and we're really
> > just trying to convey that we can do this via a single operation with
> > reduced latency?  Thanks,
> 
> atomic means DMA will either translate with the old domain or the new
> domain but never a blocking domain. Keep in mind that with nesting
> "domain" can mean a full PASID table in guest memory.
> 
> I should reiterate that replace is not an API that is required to be
> atomic.
> 

yes. Just as explained in the commit message:

"
  By removing this forced blocking domain the iommu driver has the
  opportunity to implement an atomic replacement of the domains to the
  greatest extent its hardware allows.
"

API level replace only implies removing transition to/from blocking
domain. and then it's driver's call whether it wants to take this
chance to implement a true 'atomic' behavior.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ