Message-ID: <20240516145009.3bcd3d0c.alex.williamson@redhat.com>
Date: Thu, 16 May 2024 14:50:09 -0600
From: Alex Williamson <alex.williamson@...hat.com>
To: Yan Zhao <yan.y.zhao@...el.com>
Cc: <kvm@...r.kernel.org>, <linux-kernel@...r.kernel.org>, <x86@...nel.org>,
 <jgg@...dia.com>, <kevin.tian@...el.com>, <iommu@...ts.linux.dev>,
 <pbonzini@...hat.com>, <seanjc@...gle.com>, <dave.hansen@...ux.intel.com>,
 <luto@...nel.org>, <peterz@...radead.org>, <tglx@...utronix.de>,
 <mingo@...hat.com>, <bp@...en8.de>, <hpa@...or.com>, <corbet@....net>,
 <joro@...tes.org>, <will@...nel.org>, <robin.murphy@....com>,
 <baolu.lu@...ux.intel.com>, <yi.l.liu@...el.com>
Subject: Re: [PATCH 4/5] vfio/type1: Flush CPU caches on DMA pages in
 non-coherent domains

On Mon, 13 May 2024 15:11:28 +0800
Yan Zhao <yan.y.zhao@...el.com> wrote:

> On Fri, May 10, 2024 at 10:57:28AM -0600, Alex Williamson wrote:
> > On Fri, 10 May 2024 18:31:13 +0800
> > Yan Zhao <yan.y.zhao@...el.com> wrote:
> >   
> > > On Thu, May 09, 2024 at 12:10:49PM -0600, Alex Williamson wrote:  
> > > > On Tue,  7 May 2024 14:21:38 +0800
> > > > Yan Zhao <yan.y.zhao@...el.com> wrote:    
> > > ...   
> > > > >  drivers/vfio/vfio_iommu_type1.c | 51 +++++++++++++++++++++++++++++++++
> > > > >  1 file changed, 51 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> > > > > index b5c15fe8f9fc..ce873f4220bf 100644
> > > > > --- a/drivers/vfio/vfio_iommu_type1.c
> > > > > +++ b/drivers/vfio/vfio_iommu_type1.c
> > > > > @@ -74,6 +74,7 @@ struct vfio_iommu {
> > > > >  	bool			v2;
> > > > >  	bool			nesting;
> > > > >  	bool			dirty_page_tracking;
> > > > > +	bool			has_noncoherent_domain;
> > > > >  	struct list_head	emulated_iommu_groups;
> > > > >  };
> > > > >  
> > > > > @@ -99,6 +100,7 @@ struct vfio_dma {
> > > > >  	unsigned long		*bitmap;
> > > > >  	struct mm_struct	*mm;
> > > > >  	size_t			locked_vm;
> > > > > +	bool			cache_flush_required; /* For noncoherent domain */    
> > > > 
> > > > Poor packing, minimally this should be grouped with the other bools in
> > > > the structure, longer term they should likely all be converted to
> > > > bit fields.    
> > > Yes. Will do!
> > >   
> > > >     
> > > > >  };
> > > > >  
> > > > >  struct vfio_batch {
> > > > > @@ -716,6 +718,9 @@ static long vfio_unpin_pages_remote(struct vfio_dma *dma, dma_addr_t iova,
> > > > >  	long unlocked = 0, locked = 0;
> > > > >  	long i;
> > > > >  
> > > > > +	if (dma->cache_flush_required)
> > > > > +		arch_clean_nonsnoop_dma(pfn << PAGE_SHIFT, npage << PAGE_SHIFT);
> > > > > +
> > > > >  	for (i = 0; i < npage; i++, iova += PAGE_SIZE) {
> > > > >  		if (put_pfn(pfn++, dma->prot)) {
> > > > >  			unlocked++;
> > > > > @@ -1099,6 +1104,8 @@ static long vfio_unmap_unpin(struct vfio_iommu *iommu, struct vfio_dma *dma,
> > > > >  					    &iotlb_gather);
> > > > >  	}
> > > > >  
> > > > > +	dma->cache_flush_required = false;
> > > > > +
> > > > >  	if (do_accounting) {
> > > > >  		vfio_lock_acct(dma, -unlocked, true);
> > > > >  		return 0;
> > > > > @@ -1120,6 +1127,21 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
> > > > >  	iommu->dma_avail++;
> > > > >  }
> > > > >  
> > > > > +static void vfio_update_noncoherent_domain_state(struct vfio_iommu *iommu)
> > > > > +{
> > > > > +	struct vfio_domain *domain;
> > > > > +	bool has_noncoherent = false;
> > > > > +
> > > > > +	list_for_each_entry(domain, &iommu->domain_list, next) {
> > > > > +		if (domain->enforce_cache_coherency)
> > > > > +			continue;
> > > > > +
> > > > > +		has_noncoherent = true;
> > > > > +		break;
> > > > > +	}
> > > > > +	iommu->has_noncoherent_domain = has_noncoherent;
> > > > > +}    
> > > > 
> > > > This should be merged with vfio_domains_have_enforce_cache_coherency()
> > > > and the VFIO_DMA_CC_IOMMU extension (if we keep it, see below).    
> > > Will convert it to a counter and do the merge.
> > > Thanks for pointing it out!
> > >   
> > > >     
> > > > > +
> > > > >  static void vfio_update_pgsize_bitmap(struct vfio_iommu *iommu)
> > > > >  {
> > > > >  	struct vfio_domain *domain;
> > > > > @@ -1455,6 +1477,12 @@ static int vfio_pin_map_dma(struct vfio_iommu *iommu, struct vfio_dma *dma,
> > > > >  
> > > > >  	vfio_batch_init(&batch);
> > > > >  
> > > > > +	/*
> > > > > +	 * Record necessity to flush CPU cache to make sure CPU cache is flushed
> > > > > +	 * for both pin & map and unmap & unpin (for unwind) paths.
> > > > > +	 */
> > > > > +	dma->cache_flush_required = iommu->has_noncoherent_domain;
> > > > > +
> > > > >  	while (size) {
> > > > >  		/* Pin a contiguous chunk of memory */
> > > > >  		npage = vfio_pin_pages_remote(dma, vaddr + dma->size,
> > > > > @@ -1466,6 +1494,10 @@ static int vfio_pin_map_dma(struct vfio_iommu *iommu, struct vfio_dma *dma,
> > > > >  			break;
> > > > >  		}
> > > > >  
> > > > > +		if (dma->cache_flush_required)
> > > > > +			arch_clean_nonsnoop_dma(pfn << PAGE_SHIFT,
> > > > > +						npage << PAGE_SHIFT);
> > > > > +
> > > > >  		/* Map it! */
> > > > >  		ret = vfio_iommu_map(iommu, iova + dma->size, pfn, npage,
> > > > >  				     dma->prot);
> > > > > @@ -1683,9 +1715,14 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
> > > > >  	for (; n; n = rb_next(n)) {
> > > > >  		struct vfio_dma *dma;
> > > > >  		dma_addr_t iova;
> > > > > +		bool cache_flush_required;
> > > > >  
> > > > >  		dma = rb_entry(n, struct vfio_dma, node);
> > > > >  		iova = dma->iova;
> > > > > +		cache_flush_required = !domain->enforce_cache_coherency &&
> > > > > +				       !dma->cache_flush_required;
> > > > > +		if (cache_flush_required)
> > > > > +			dma->cache_flush_required = true;    
> > > > 
> > > > The variable name here isn't accurate and the logic is confusing.  If
> > > > the domain does not enforce coherency and the mapping is not tagged as
> > > > requiring a cache flush, then we need to mark the mapping as requiring
> > > > a cache flush.  So the variable state is something more akin to
> > > > set_cache_flush_required.  But all we're saving with this is a
> > > > redundant set if the mapping is already tagged as requiring a cache
> > > > flush, so it could really be simplified to:
> > > > 
> > > > 		dma->cache_flush_required = !domain->enforce_cache_coherency;    
> > > Sorry about the confusion.
> > > 
> > > If dma->cache_flush_required is set to true by a domain not enforcing cache
> > > coherency, it must not be reset to false by a later attach to a domain that
> > > does enforce cache coherency, because of the lazy-flush design.
> > 
> > Right, ok, the vfio_dma objects are shared between domains so we never
> > want to set 'dma->cache_flush_required = false' due to the addition of a
> > 'domain->enforce_cache_coherent == true'.  So this could be:
> > 
> > 	if (!dma->cache_flush_required)
> > 		dma->cache_flush_required = !domain->enforce_cache_coherency;  
> 
> Though this code is easier to understand, it leads to redundant writes of
> false to dma->cache_flush_required, given that domain->enforce_cache_coherency
> is true most of the time.

I don't really see that as an issue, but the variable name originally
chosen above, cache_flush_required, also doesn't convey that it's only
attempting to set the value if it wasn't previously set and is now
required by a noncoherent domain.

> > > > It might add more clarity to just name the mapping flag
> > > > dma->mapped_noncoherent.    
> > > 
> > > dma->cache_flush_required marks whether pages in a vfio_dma require a
> > > cache flush on the subsequent mapping into the first non-coherent domain
> > > and on page unpinning.  
> > 
> > How do we arrive at a sequence where we have dma->cache_flush_required
> > that isn't the result of being mapped into a domain with
> > !domain->enforce_cache_coherency?  
> Hmm, dma->cache_flush_required IS the result of being mapped into a domain with
> !domain->enforce_cache_coherency.
> My concern arises only from the actual code sequence, i.e.
> dma->cache_flush_required is set to true before the actual mapping.
> 
> If we rename it to dma->mapped_noncoherent and only set it to true after a
> successful mapping, more code would be needed to handle flushing for the
> unwind case.
> Currently, the unwind flush is handled centrally in vfio_unpin_pages_remote()
> by checking dma->cache_flush_required, which is true even before the mapping
> fully succeeds, so we won't miss flushing any pages that were briefly mapped
> into a non-coherent domain.

I don't think we need to be so literal that "mapped_noncoherent" can
only be set after the vfio_dma is fully mapped to a noncoherent domain,
but also we can come up with other names for the flag.  Perhaps
"is_noncoherent".  My suggestion was more from the perspective of what
does the flag represent rather than what we intend to do as a result of
the flag being set.  Thanks,

Alex

