[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150430042107.GE15238@voom.redhat.com>
Date: Thu, 30 Apr 2015 14:21:07 +1000
From: David Gibson <david@...son.dropbear.id.au>
To: Alexey Kardashevskiy <aik@...abs.ru>
Cc: linuxppc-dev@...ts.ozlabs.org,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Alex Williamson <alex.williamson@...hat.com>,
Gavin Shan <gwshan@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH kernel v9 18/32] powerpc/iommu/powernv: Release replaced
TCE
On Wed, Apr 29, 2015 at 07:51:21PM +1000, Alexey Kardashevskiy wrote:
> On 04/29/2015 02:18 PM, David Gibson wrote:
> >On Sat, Apr 25, 2015 at 10:14:42PM +1000, Alexey Kardashevskiy wrote:
> >>At the moment writing new TCE value to the IOMMU table fails with EBUSY
> >>if there is a valid entry already. However PAPR specification allows
> >>the guest to write new TCE value without clearing it first.
> >>
> >>Another problem this patch is addressing is the use of pool locks for
> >>external IOMMU users such as VFIO. The pool locks are to protect
> >>DMA page allocator rather than entries and since the host kernel does
> >>not control what pages are in use, there is no point in pool locks and
> >>exchange()+put_page(oldtce) is sufficient to avoid possible races.
> >>
> >>This adds an exchange() callback to iommu_table_ops which does the same
> >>thing as set() plus it returns replaced TCE and DMA direction so
> >>the caller can release the pages afterwards. The exchange() receives
> >>a physical address unlike set() which receives linear mapping address;
> >>and returns a physical address as the clear() does.
> >>
> >>This implements exchange() for P5IOC2/IODA/IODA2. This adds a requirement
> >>for a platform to have exchange() implemented in order to support VFIO.
> >>
> >>This replaces iommu_tce_build() and iommu_clear_tce() with
> >>a single iommu_tce_xchg().
> >>
> >>This makes sure that TCE permission bits are not set in TCE passed to
> >>IOMMU API as those are to be calculated by platform code from DMA direction.
> >>
> >>This moves SetPageDirty() to the IOMMU code to make it work for both
> >>VFIO ioctl interface in in-kernel TCE acceleration (when it becomes
> >>available later).
> >>
> >>Signed-off-by: Alexey Kardashevskiy <aik@...abs.ru>
> >>[aw: for the vfio related changes]
> >>Acked-by: Alex Williamson <alex.williamson@...hat.com>
> >
> >This looks mostly good, but there are couple of details that need fixing.
> >
>
>
> [...]
>
> >>diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
> >>index ba75aa5..e8802ac 100644
> >>--- a/arch/powerpc/platforms/powernv/pci.c
> >>+++ b/arch/powerpc/platforms/powernv/pci.c
> >>@@ -598,6 +598,23 @@ int pnv_tce_build(struct iommu_table *tbl, long index, long npages,
> >> return 0;
> >> }
> >>
> >>+#ifdef CONFIG_IOMMU_API
> >>+int pnv_tce_xchg(struct iommu_table *tbl, long index,
> >>+ unsigned long *tce, enum dma_data_direction *direction)
> >>+{
> >>+ u64 proto_tce = iommu_direction_to_tce_perm(*direction);
> >>+ unsigned long newtce = *tce | proto_tce;
> >>+ unsigned long idx = index - tbl->it_offset;
> >
> >Should this have a BUG_ON or WARN_ON if the supplied tce has bits set
> >below the page mask?
>
>
> Why? The caller checks these bits, do we really need to duplicate it
> here?
Because this is the crunch point where bad bits will actually cause
strange stuff to happen.
As much as anything the point of a BUG_ON would be to document that
this function expects the parameter to be aligned.
>
>
> >>+ *tce = xchg(pnv_tce(tbl, idx), cpu_to_be64(newtce));
> >>+ *tce = be64_to_cpu(*tce);
> >>+ *direction = iommu_tce_direction(*tce);
> >>+ *tce &= ~(TCE_PCI_READ | TCE_PCI_WRITE);
> >>+
> >>+ return 0;
> >>+}
> >>+#endif
>
>
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
Content of type "application/pgp-signature" skipped
Powered by blists - more mailing lists