Message-ID: <BN9PR11MB5276C98AADBE65795DAC5CB28C8CA@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Thu, 15 Jan 2026 05:44:08 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: Baolu Lu <baolu.lu@...ux.intel.com>, Samiullah Khawaja
	<skhawaja@...gle.com>, Joerg Roedel <joro@...tes.org>, Will Deacon
	<will@...nel.org>, Robin Murphy <robin.murphy@....com>, Dmytro Maluka
	<dmaluka@...omium.org>, "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH 3/3] iommu/vt-d: Rework hitless PASID entry replacement

> From: Jason Gunthorpe <jgg@...dia.com>
> Sent: Wednesday, January 14, 2026 9:17 PM
> 
> On Wed, Jan 14, 2026 at 07:26:10AM +0000, Tian, Kevin wrote:
> > before cache is flushed, it may contain:
> >
> >  - entries tagged with old DID, with content loaded from old table
> >  - entries tagged with old DID, with content loaded from new table
> >  - entries tagged with new DID, with content loaded from new table
> >
> > Compared to 2nd-stage the only problematic one is old DID + new table.
> >
> > According to 6.2.1 (Tagging of Cached Translations), the root address
> > of page table is not used in tagging and DID-based invalidation will
> > flush all entries related to old DID (no matter it's from old or new table).
> >
> > Then it should just work!
> 
> Unless the original domain is attached to another device, then you've
> corrupted the DID and corrupted IOTLB for the second innocent device
> that isn't changing translation.

ah, yes, that's the key point!
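
To make the failure concrete, here is a toy model in Python (not kernel
code). It assumes only what is stated above: cached translations are tagged
by DID and not by the page table root. The Iotlb class, the DID values and
the page names are all made up for illustration, and PASID/source-id
tagging and device TLBs are ignored.

```python
class Iotlb:
    """Toy translation cache keyed by (DID, IOVA); the table root is NOT part of the tag."""

    def __init__(self):
        self.entries = {}

    def translate(self, did, table, iova):
        key = (did, iova)
        if key not in self.entries:
            # Miss: walk whatever table the entry currently points at and cache it.
            self.entries[key] = table[iova]
        return self.entries[key]

    def flush_did(self, did):
        # DID-based invalidation drops every entry tagged with that DID.
        self.entries = {k: v for k, v in self.entries.items() if k[0] != did}


OLD_DID = 1  # hypothetical DID shared by devices A and B
IOVA = 0x1000
old_table = {IOVA: "old_page"}
new_table = {IOVA: "new_page"}

iotlb = Iotlb()

# Device A is mid-replacement: table pointer already new, DID still old.
a_seen = iotlb.translate(OLD_DID, new_table, IOVA)

# Device B is still attached to the old domain and is not changing anything,
# yet it hits the entry that A just loaded from the new table.
b_seen = iotlb.translate(OLD_DID, old_table, IOVA)

# Flushing the old DID repairs the cache, but only after B was exposed.
iotlb.flush_did(OLD_DID)
b_after_flush = iotlb.translate(OLD_DID, old_table, IOVA)

print(a_seen, b_seen, b_after_flush)
```

In this model B observes new-table content for as long as the old DID +
new table entry sits in the cache, which is the cross-device corruption
described above.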

> 
> > p.s. Jason said that atomic size is 128bit on AMD and 64bit on ARM.
> > they both have DID concept and two page table pointers. So I assume
> > it's the same case on this front?
> 
> Hmm, yeah, ARM has it worse you can't change any ASID/VMID
> concurrently with the table pointer.
> 
> You could make a safe algorithm by allocating a temporary ID, moving
> the current entry to the temporary ID, moving to the new pointer,
> moving to the final ID, then flushing the temporary ID.

right, that ensures the corrupted state is only associated with the
temporary ID.
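
A rough sketch of that sequence, again as a toy Python model and not a
proposal for the driver. Each assignment stands in for one atomic-size
write to the entry. PasidEntry, replace_via_temp_did(), the probe callback
and the DID values are hypothetical names used only here; the cache is
assumed to be tagged by DID alone.

```python
from dataclasses import dataclass


@dataclass
class PasidEntry:
    did: int
    table: dict


class Iotlb:
    """Toy translation cache keyed by (DID, IOVA)."""

    def __init__(self):
        self.entries = {}

    def translate(self, entry, iova):
        key = (entry.did, iova)
        if key not in self.entries:
            self.entries[key] = entry.table[iova]  # miss: walk current table
        return self.entries[key]

    def flush_did(self, did):
        self.entries = {k: v for k, v in self.entries.items() if k[0] != did}


def replace_via_temp_did(iotlb, entry, new_did, new_table, temp_did, probe):
    """Move entry to (new_did, new_table) one field at a time.

    probe() runs after every step to model DMA racing with the update.
    """
    entry.did = temp_did      # step 1: old table under the temporary ID
    probe()
    entry.table = new_table   # step 2: any stale mix is tagged temp_did only
    probe()
    entry.did = new_did       # step 3: final ID, which only ever saw new_table
    probe()
    iotlb.flush_did(temp_did)  # step 4: discard the possibly mixed entries


OLD_DID, NEW_DID, TEMP_DID = 1, 2, 9  # hypothetical IDs
IOVA = 0x1000
old_table = {IOVA: "old_page"}
new_table = {IOVA: "new_page"}

iotlb = Iotlb()
dev_a = PasidEntry(OLD_DID, old_table)  # device being moved
dev_b = PasidEntry(OLD_DID, old_table)  # innocent device sharing the old domain
a_log, b_log = [], []


def probe():
    a_log.append(iotlb.translate(dev_a, IOVA))
    b_log.append(iotlb.translate(dev_b, IOVA))


probe()  # warm the cache before the replacement starts
replace_via_temp_did(iotlb, dev_a, NEW_DID, new_table, TEMP_DID, probe)

temp_left = [k for k in iotlb.entries if k[0] == TEMP_DID]
print(a_log, b_log, temp_left)
```

In this model device B only ever sees old-table content, device A ends up
on the new table, and nothing tagged with the temporary ID survives the
final flush.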

> 
> It avoids the cross device issue and your logic above would hold.
> 
> Or maybe the case Samiullah is interested in should have the new
> domain adopt the original ID..
> 

that also works. the KHO resume process could have a special DID
allocation scheme to reuse the original one.

or in Samiullah's case the old/new domains always contain the
same mappings, so no corruption would ever occur, but that'd be
a very KHO-specific assumption. 😊
