Date: Thu, 16 May 2024 08:34:20 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: "Zhao, Yan Y" <yan.y.zhao@...el.com>, Alex Williamson
	<alex.williamson@...hat.com>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"x86@...nel.org" <x86@...nel.org>, "jgg@...dia.com" <jgg@...dia.com>,
	"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>, "pbonzini@...hat.com"
	<pbonzini@...hat.com>, "seanjc@...gle.com" <seanjc@...gle.com>,
	"dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
	"luto@...nel.org" <luto@...nel.org>, "peterz@...radead.org"
	<peterz@...radead.org>, "tglx@...utronix.de" <tglx@...utronix.de>,
	"mingo@...hat.com" <mingo@...hat.com>, "bp@...en8.de" <bp@...en8.de>,
	"hpa@...or.com" <hpa@...or.com>, "corbet@....net" <corbet@....net>,
	"joro@...tes.org" <joro@...tes.org>, "will@...nel.org" <will@...nel.org>,
	"robin.murphy@....com" <robin.murphy@....com>, "baolu.lu@...ux.intel.com"
	<baolu.lu@...ux.intel.com>, "Liu, Yi L" <yi.l.liu@...el.com>
Subject: RE: [PATCH 4/5] vfio/type1: Flush CPU caches on DMA pages in
 non-coherent domains

> From: Zhao, Yan Y <yan.y.zhao@...el.com>
> Sent: Monday, May 13, 2024 3:11 PM
> On Fri, May 10, 2024 at 10:57:28AM -0600, Alex Williamson wrote:
> > On Fri, 10 May 2024 18:31:13 +0800
> > Yan Zhao <yan.y.zhao@...el.com> wrote:
> >
> > > On Thu, May 09, 2024 at 12:10:49PM -0600, Alex Williamson wrote:
> > > > On Tue,  7 May 2024 14:21:38 +0800
> > > > Yan Zhao <yan.y.zhao@...el.com> wrote:
> > > > > @@ -1847,6 +1891,9 @@ static void vfio_test_domain_fgsp(struct vfio_domain *domain, struct list_head *
> > > > >  		break;
> > > > >  	}
> > > > >
> > > > > +	if (!domain->enforce_cache_coherency)
> > > > > +		arch_clean_nonsnoop_dma(page_to_phys(pages), PAGE_SIZE * 2);
> > > > > +
> > > >
> > > > Seems like this use case isn't subject to the unmap aspect since these
> > > > are kernel allocated and freed pages rather than userspace pages.
> > > > There's not an "ongoing use of the page" concern.
> > > >
> > > > The window of opportunity for a device to discover and exploit the
> > > > mapping side issue appears almost impossibly small.
> > > >
> > > The concern is for a malicious device attempting DMAs automatically.
> > > Do you think this concern is valid?
> > > As there're only extra flushes for 4 pages, what about keeping it for safety?
> >
> > Userspace doesn't know anything about these mappings, so to exploit
> > them the device would somehow need to discover and interact with the
> > mapping in the split second that the mapping exists, without exposing
> > itself with mapping faults at the IOMMU.

Userspace could guess the attack ranges based on the code, e.g. currently
the code just tries to use the first available IOVA region, which likely
starts at address 0.

And mapping faults don't stop the attack; they're just an after-the-fact
hint revealing the possibility of having been attacked. 😊

> >
> > I don't mind keeping the flush before map so that infinitesimal gap
> > where previous data in physical memory exposed to the device is closed,
> > but I have a much harder time seeing that the flush on unmap to
> > synchronize physical memory is required.
> >
> > For example, the potential KSM use case doesn't exist since the pages
> > are not owned by the user.  Any subsequent use of the pages would be
> > subject to the same condition we assumed after allocation, where the
> > physical data may be inconsistent with the cached data.  It's easy to

Physical data can differ from the cached copy at any time. In the normal
case the cache line is marked dirty and the CPU cache coherence protocol
guarantees coherency between cache and memory.

Here we're talking about a situation where a malicious user uses non-coherent
DMA to bypass the CPU and make memory and cache inconsistent while the
CPU still considers the memory copy up-to-date (e.g. the cacheline is in
the exclusive or shared state). In this case multiple reads by the next user
may get different values, from the cache or from memory, depending on when
the cacheline is invalidated.

So it's really about a bad inconsistency state which can be recovered only
by invalidating the cacheline (so the memory data is up-to-date) or by doing
a WB-type store (to mark the memory copy out-of-date) before the next use.
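
To make the recovery concrete, below is a minimal sketch of that kind of
write-back/invalidate flush over a physical range on x86, using the existing
clflush_cache_range() helper and assuming the range is covered by the kernel
direct map. The helper name is made up for illustration and is not the
patch's arch_clean_nonsnoop_dma():

#include <linux/types.h>
#include <asm/cacheflush.h>
#include <asm/io.h>

/*
 * Hypothetical illustration only: write back and invalidate the CPU cache
 * lines covering [phys, phys + size) so the memory copy is consistent
 * before the pages are reused.
 */
static void flush_phys_range_for_nonsnoop_dma(phys_addr_t phys, size_t size)
{
	void *vaddr = phys_to_virt(phys);

	/* clflush_cache_range() flushes each cache line in the range */
	clflush_cache_range(vaddr, size);
}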

> > flush 2 pages, but I think it obscures the function of the flush if we
> > can't articulate the value in this case.

Btw, KSM is just one example. Jason mentioned in an earlier discussion that
not all free pages are zeroed before the next use, so it'd always be good to
conservatively prevent any potentially inconsistent state from leaking back
to the kernel. Though I'm not sure what a real usage would be in which the
next user directly consumes the uninitialized content without doing any
meaningful writes (which, once done, would close the attack window)...

> >
> I agree the second flush is not necessary if we are confident that functions in
> between the two flushes do not and will not touch the page on the CPU side.
> However, can we guarantee this? For instance, is it possible for some IOMMU
> driver to read/write the page for some quirks? (Or is that just totally
> paranoid?)
> If that's not impossible, then ensuring cache and memory coherency before
> page reclaiming is better?
> 

I don't think it's a valid argument.
