Message-ID: <20230106102450.2e6c70bb.alex.williamson@redhat.com>
Date:   Fri, 6 Jan 2023 10:24:50 -0700
From:   Alex Williamson <alex.williamson@...hat.com>
To:     Niklas Schnelle <schnelle@...ux.ibm.com>
Cc:     Cornelia Huck <cohuck@...hat.com>, Jason Gunthorpe <jgg@...pe.ca>,
        kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-s390@...r.kernel.org,
        Matthew Rosato <mjrosato@...ux.ibm.com>,
        Pierre Morel <pmorel@...ux.ibm.com>,
        Christian Bornträger <borntraeger@...ux.ibm.com>
Subject: Re: [PATCH v2 1/1] vfio/type1: Respect IOMMU reserved regions in
 vfio_test_domain_fgsp()

On Wed,  4 Jan 2023 16:42:02 +0100
Niklas Schnelle <schnelle@...ux.ibm.com> wrote:

> Since commit cbf7827bc5dc ("iommu/s390: Fix potential s390_domain
> aperture shrinking") the s390 IOMMU driver uses reserved regions for the

Are you asking for this in v6.2?  Seems like the above was introduced
in v6.2 and I can't tell if this is sufficiently prevalent that we need
a fix in the same release.

> system provided DMA ranges of PCI devices. Previously it reduced the
> size of the IOMMU aperture and checked it on each mapping operation.
> On current machines the system denies use of DMA addresses below 2^32 for
> all PCI devices.
> 
> Usually mapping IOVAs in a reserved region is harmless until a DMA
> actually tries to utilize the mapping. However on s390 there is
> a virtual PCI device called ISM which is implemented in firmware and
> used for cross LPAR communication. Unlike real PCI devices this device
> does not use the hardware IOMMU but inspects IOMMU translation tables
> directly on IOTLB flush (s390 RPCIT instruction). If it detects IOVA
> mappings outside the allowed ranges it goes into an error state. This
> error state then causes the device to be unavailable to the KVM guest.
> 
> Analysing this we found that vfio_test_domain_fgsp() maps 2 pages at DMA
> address 0 irrespective of the IOMMU's reserved regions. Even if this is
> usually harmless it seems wrong in the general case, so instead go
> through the freshly updated IOVA list and try to find a range that
> isn't reserved, fits 2 pages and is PAGE_SIZE * 2 aligned. If found,
> use that for testing for fine grained super pages.
> 
> Fixes: 6fe1010d6d9c ("vfio/type1: DMA unmap chunking")

Nit, the above patch pre-dates any notion of reserved regions, so isn't
this actually fixing the implementation of reserved regions in type1 to
include this test?  ie.

Fixes: af029169b8fd ("vfio/type1: Check reserved region conflict and update iovalist")

> Reported-by: Matthew Rosato <mjrosato@...ux.ibm.com>
> Signed-off-by: Niklas Schnelle <schnelle@...ux.ibm.com>
> ---
> v1 -> v2:
> - Reworded commit message to hopefully explain things a bit better and
>   highlight that usually just mapping but not issuing DMAs for IOVAs in
>   a reserved region is harmless but still breaks things with ISM devices.
> - Added a check for PAGE_SIZE * 2 alignment (Jason)
> 
>  drivers/vfio/vfio_iommu_type1.c | 30 +++++++++++++++++++-----------
>  1 file changed, 19 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 23c24fe98c00..87b27ffb93d0 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -1856,24 +1856,32 @@ static int vfio_iommu_replay(struct vfio_iommu *iommu,
>   * significantly boosts non-hugetlbfs mappings and doesn't seem to hurt when
>   * hugetlbfs is in use.
>   */
> -static void vfio_test_domain_fgsp(struct vfio_domain *domain)
> +static void vfio_test_domain_fgsp(struct vfio_domain *domain, struct list_head *regions)
>  {
> -	struct page *pages;
>  	int ret, order = get_order(PAGE_SIZE * 2);
> +	struct vfio_iova *region;
> +	struct page *pages;
>  
>  	pages = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
>  	if (!pages)
>  		return;
>  
> -	ret = iommu_map(domain->domain, 0, page_to_phys(pages), PAGE_SIZE * 2,
> -			IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
> -	if (!ret) {
> -		size_t unmapped = iommu_unmap(domain->domain, 0, PAGE_SIZE);
> +	list_for_each_entry(region, regions, list) {
> +		if (region->end - region->start < PAGE_SIZE * 2 ||
> 				region->start % (PAGE_SIZE * 2))

Maybe this falls into the noise, but we don't care if region->start is
aligned to a double page, so long as we can map an aligned double page
within the region.  Maybe something like:

	dma_addr_t start = ALIGN(region->start, PAGE_SIZE * 2);

	if (start >= region->end || (region->end - start < PAGE_SIZE * 2))
		continue;
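
Spelled out, the whole loop would then read something like this
(untested, just to illustrate folding the above in):

	list_for_each_entry(region, regions, list) {
		/* first double-page-aligned IOVA inside the region */
		dma_addr_t start = ALIGN(region->start, PAGE_SIZE * 2);

		/* skip regions with no aligned double page inside them */
		if (start >= region->end ||
		    (region->end - start < PAGE_SIZE * 2))
			continue;

		ret = iommu_map(domain->domain, start, page_to_phys(pages),
				PAGE_SIZE * 2,
				IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
		if (!ret) {
			size_t unmapped = iommu_unmap(domain->domain, start,
						      PAGE_SIZE);

			if (unmapped == PAGE_SIZE)
				iommu_unmap(domain->domain,
					    start + PAGE_SIZE, PAGE_SIZE);
			else
				domain->fgsp = true;
		}
		break;
	}

For example, with 4K pages a region spanning [0x1800, 0x4800) gives
start = ALIGN(0x1800, 0x2000) = 0x2000 with 0x2800 bytes left, so it
qualifies even though region->start itself isn't double-page aligned
(the modulo check in the patch would have skipped it).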


s/region->// for below if so.  Thanks,

Alex

> +			continue;
>  
> -		if (unmapped == PAGE_SIZE)
> -			iommu_unmap(domain->domain, PAGE_SIZE, PAGE_SIZE);
> -		else
> -			domain->fgsp = true;
> +		ret = iommu_map(domain->domain, region->start, page_to_phys(pages), PAGE_SIZE * 2,
> +				IOMMU_READ | IOMMU_WRITE | IOMMU_CACHE);
> +		if (!ret) {
> +			size_t unmapped = iommu_unmap(domain->domain, region->start, PAGE_SIZE);
> +
> +			if (unmapped == PAGE_SIZE)
> +				iommu_unmap(domain->domain, region->start + PAGE_SIZE, PAGE_SIZE);
> +			else
> +				domain->fgsp = true;
> +		}
> +		break;
>  	}
>  
>  	__free_pages(pages, order);
> @@ -2326,7 +2334,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>  		}
>  	}
>  
> -	vfio_test_domain_fgsp(domain);
> +	vfio_test_domain_fgsp(domain, &iova_copy);
>  
>  	/* replay mappings on new domains */
>  	ret = vfio_iommu_replay(iommu, domain);
