[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150501042319.GM24886@voom.redhat.com>
Date: Fri, 1 May 2015 14:23:19 +1000
From: David Gibson <david@...son.dropbear.id.au>
To: Alexey Kardashevskiy <aik@...abs.ru>
Cc: linuxppc-dev@...ts.ozlabs.org,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Alex Williamson <alex.williamson@...hat.com>,
Gavin Shan <gwshan@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH kernel v9 26/32] powerpc/iommu: Add userspace view of TCE
table
On Fri, May 01, 2015 at 02:01:17PM +1000, Alexey Kardashevskiy wrote:
> On 04/29/2015 04:31 PM, David Gibson wrote:
> >On Sat, Apr 25, 2015 at 10:14:50PM +1000, Alexey Kardashevskiy wrote:
> >>In order to support memory pre-registration, we need a way to track
> >>the use of every registered memory region and only allow unregistration
> >>if a region is not in use anymore. So we need a way to tell from what
> >>region the just cleared TCE was from.
> >>
> >>This adds a userspace view of the TCE table into iommu_table struct.
> >>It contains userspace address, one per TCE entry. The table is only
> >>allocated when the ownership over an IOMMU group is taken which means
> >>it is only used from outside of the powernv code (such as VFIO).
> >>
> >>Signed-off-by: Alexey Kardashevskiy <aik@...abs.ru>
> >>---
> >>Changes:
> >>v9:
> >>* fixed code flow in error cases added in v8
> >>
> >>v8:
> >>* added ENOMEM on failed vzalloc()
> >>---
> >> arch/powerpc/include/asm/iommu.h | 6 ++++++
> >> arch/powerpc/kernel/iommu.c | 18 ++++++++++++++++++
> >> arch/powerpc/platforms/powernv/pci-ioda.c | 22 ++++++++++++++++++++--
> >> 3 files changed, 44 insertions(+), 2 deletions(-)
> >>
> >>diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
> >>index 7694546..1472de3 100644
> >>--- a/arch/powerpc/include/asm/iommu.h
> >>+++ b/arch/powerpc/include/asm/iommu.h
> >>@@ -111,9 +111,15 @@ struct iommu_table {
> >> unsigned long *it_map; /* A simple allocation bitmap for now */
> >> unsigned long it_page_shift;/* table iommu page size */
> >> struct iommu_table_group *it_table_group;
> >>+ unsigned long *it_userspace; /* userspace view of the table */
> >
> >A single unsigned long doesn't seem like enough.
>
> Why single? This is an array.
As in single per page.
> > How do you know
> >which process's address space this address refers to?
>
> It is a current task. Multiple userspaces cannot use the same container/tables.
Where is that enforced?
More to the point, that's a VFIO constraint, but it's here affecting
the design of a structure owned by the platform code.
[snip]
> >> static void pnv_pci_ioda_setup_opal_tce_kill(struct pnv_phb *phb,
> >>@@ -2062,12 +2071,21 @@ static long pnv_pci_ioda2_create_table(struct iommu_table_group *table_group,
> >> int nid = pe->phb->hose->node;
> >> __u64 bus_offset = num ? pe->tce_bypass_base : 0;
> >> long ret;
> >>+ unsigned long *uas, uas_cb = sizeof(*uas) * (window_size >> page_shift);
> >>+
> >>+ uas = vzalloc(uas_cb);
> >>+ if (!uas)
> >>+ return -ENOMEM;
> >
> >I don't see why this is allocated both here as well as in
> >take_ownership.
>
> Where else? The only alternative is vfio_iommu_spapr_tce but I really do not
> want to touch iommu_table fields there.
Well to put it another way, why isn't take_ownership calling create
itself (or at least a common helper).
Clearly the it_userspace table needs to have lifetime which matches
the TCE table itself, so there should be a single function that marks
the beginning of that joint lifetime.
> >Isn't this function used for core-kernel users of the
> >iommu as well, in which case it shouldn't need the it_userspace.
>
>
> No. This is an iommu_table_group_ops callback which calls what the platform
> code calls (pnv_pci_create_table()) plus allocates this it_userspace thing.
> The callback is only called from VFIO.
Ok.
As touched on above it seems more like this should be owned by VFIO
code than the platform code.
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
Content of type "application/pgp-signature" skipped
Powered by blists - more mailing lists