Open Source and information security mailing list archives
 
Date:	Thu, 07 Nov 2013 16:37:27 +0000
From:	David Woodhouse <dwmw2@...radead.org>
To:	Alex Williamson <alex.williamson@...hat.com>
Cc:	iommu@...ts.linux-foundation.org, joro@...tes.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] iommu: Split iommu_unmaps

On Fri, 2013-05-24 at 11:14 -0600, Alex Williamson wrote:
> iommu_map splits requests into pages that the iommu driver reports
> that it can handle.  The iommu_unmap path does not do the same.  This
> can cause problems not only from callers that might expect the same
> behavior as the map path, but even from the failure path of iommu_map,
> should it fail at a point where it has mapped and needs to unwind a
> set of pages that the iommu driver cannot handle directly.  amd_iommu,
> for example, will BUG_ON if asked to unmap a non power of 2 size.
> 
> Fix this by extracting and generalizing the sizing code from the
> iommu_map path and use it for both map and unmap.
> 
> Signed-off-by: Alex Williamson <alex.williamson@...hat.com>
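(For illustration: the sizing step that the quoted patch says it extracts and generalizes boils down to picking the largest driver-supported page size compatible with the current position in the range. A rough userspace model — the names here are mine, not the patch's:)

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace model of the sizing step the patch factors out (my names,
 * not the patch's): choose the largest page size from the driver's
 * supported-sizes bitmap that fits the current iova alignment and the
 * bytes remaining in the request. */
static size_t pick_pgsize(unsigned long pgsize_bitmap,
                          uint64_t iova, size_t remaining)
{
    size_t best = 0;

    /* walk supported sizes from smallest to largest set bit */
    for (unsigned long bm = pgsize_bitmap; bm; bm &= bm - 1) {
        size_t pgsize = bm & -bm;       /* lowest set bit */
        if (pgsize > remaining)
            break;                      /* too big for what's left */
        if (iova & (pgsize - 1))
            continue;                   /* iova not aligned to this size */
        best = pgsize;
    }
    return best;
}
```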

Ick, this is horrid and looks like it will introduce a massive
performance hit.

Surely the answer is to fix the AMD driver so that it will just get on
with it and unmap the {address, range} that it's asked to unmap?

And fix the generic iommu_map() code while we're at it, to do it the
same way.

IOTLB flushes are *slow*, and on certain hardware with
non-cache-coherent page tables even the dcache flush for the page table
is slow. If you deliberately break it up into individual pages and do
the flush between each one, rather than just unmapping the range, you
are pessimising the performance quite hard.
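(A back-of-envelope cost model, with my own numbers and names rather than anything from the kernel: splitting a 1MiB unmap into 4KiB calls means an IOTLB flush per page, where handing the driver the whole range means one.)

```c
#include <stddef.h>

/* Rough cost model (illustrative, not kernel code): splitting an unmap
 * into per-page calls implies one IOTLB flush per page, while passing
 * the driver the whole range lets it flush once at the end. */
static unsigned flushes_per_page_split(size_t size, size_t pgsize)
{
    unsigned flushes = 0;
    for (size_t done = 0; done < size; done += pgsize)
        flushes++;                  /* each split-off unmap flushes */
    return flushes;
}

static unsigned flushes_full_range(void)
{
    return 1;                       /* one flush covers the whole range */
}
```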

A while back, I went through the Intel IOMMU code to make sure it was
doing this right — it used to have this kind of bogosity with repeated
per-page cache and IOTLB flushes *internally*, and the resulting
performance improvement was shown at http://david.woodhou.se/iommu.png

You will basically be undoing that work, by ensuring that the low-level
driver never *sees* the full range.

If the AMD driver really can't handle more than one page at a time, let
it loop for *itself* over the pages.
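(Sketch of what that would look like — illustrative names, not real amd_iommu code: the driver walks the range page by page internally and issues a single IOTLB flush at the end, so the caller never has to split.)

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of the suggestion (hypothetical names, not real amd_iommu
 * code): if the hardware can only clear one PTE at a time, the driver
 * loops over the range itself and flushes the IOTLB once at the end. */
static size_t driver_unmap_range(uint64_t iova, size_t size,
                                 size_t pgsize, unsigned *iotlb_flushes)
{
    size_t unmapped = 0;

    (void)iova;     /* only the real page-table walk would use it */
    while (unmapped < size) {
        /* clear_pte(iova + unmapped) would go here in a real driver */
        unmapped += pgsize;
    }
    (*iotlb_flushes)++;             /* single flush for the whole range */
    return unmapped;
}
```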

-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@...el.com                              Intel Corporation

