Message-ID: <55541B6C.1000903@ozlabs.ru>
Date: Thu, 14 May 2015 13:50:04 +1000
From: Alexey Kardashevskiy <aik@...abs.ru>
To: Gavin Shan <gwshan@...ux.vnet.ibm.com>
CC: linuxppc-dev@...ts.ozlabs.org,
David Gibson <david@...son.dropbear.id.au>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Paul Mackerras <paulus@...ba.org>,
Alex Williamson <alex.williamson@...hat.com>,
Wei Yang <weiyang@...ux.vnet.ibm.com>,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH kernel v10 21/34] powerpc/powernv/ioda2: Add TCE invalidation
for all attached groups
On 05/14/2015 12:22 PM, Gavin Shan wrote:
> On Tue, May 12, 2015 at 01:39:10AM +1000, Alexey Kardashevskiy wrote:
>> The iommu_table struct keeps a list of IOMMU groups it is used for.
>> At the moment there is just a single group attached, but further
>> patches will add TCE table sharing. When sharing is enabled, the TCE
>> cache in each PE needs to be invalidated, which is what this patch does.
>>
>> This does not change pnv_pci_ioda1_tce_invalidate() as there is no plan
>> to enable TCE table sharing on PHBs older than IODA2.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@...abs.ru>
>> ---
>> Changes:
>> v10:
>> * new to the series
>> ---
>> arch/powerpc/platforms/powernv/pci-ioda.c | 35 ++++++++++++++++++++-----------
>> 1 file changed, 23 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
>> index f972e40..8e4987d 100644
>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
>> @@ -24,6 +24,7 @@
>> #include <linux/msi.h>
>> #include <linux/memblock.h>
>> #include <linux/iommu.h>
>> +#include <linux/rculist.h>
>>
>> #include <asm/sections.h>
>> #include <asm/io.h>
>> @@ -1763,23 +1764,15 @@ static inline void pnv_pci_ioda2_tvt_invalidate(struct pnv_ioda_pe *pe)
>> __raw_writeq(cpu_to_be64(val), pe->tce_inval_reg);
>> }
>>
>> -static void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
>> - unsigned long index, unsigned long npages, bool rm)
>> +static void pnv_pci_ioda2_tce_do_invalidate(unsigned pe_number, bool rm,
>> + __be64 __iomem *invalidate, unsigned shift,
>> + unsigned long index, unsigned long npages)
>> {
>> - struct iommu_table_group_link *tgl = list_first_entry_or_null(
>> - &tbl->it_group_list, struct iommu_table_group_link,
>> - next);
>> - struct pnv_ioda_pe *pe = container_of(tgl->table_group,
>> - struct pnv_ioda_pe, table_group);
>> unsigned long start, end, inc;
>> - __be64 __iomem *invalidate = rm ?
>> - (__be64 __iomem *)pe->tce_inval_reg_phys :
>> - pe->tce_inval_reg;
>> - const unsigned shift = tbl->it_page_shift;
>>
>> /* We'll invalidate DMA address in PE scope */
>> start = 0x2ull << 60;
>> - start |= (pe->pe_number & 0xFF);
>> + start |= (pe_number & 0xFF);
>> end = start;
>>
>> /* Figure out the start, end and step */
>> @@ -1797,6 +1790,24 @@ static void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
>> }
>> }
>>
>> +static void pnv_pci_ioda2_tce_invalidate(struct iommu_table *tbl,
>> + unsigned long index, unsigned long npages, bool rm)
>> +{
>> + struct iommu_table_group_link *tgl;
>> +
>> + list_for_each_entry_rcu(tgl, &tbl->it_group_list, next) {
>> + struct pnv_ioda_pe *pe = container_of(tgl->table_group,
>> + struct pnv_ioda_pe, table_group);
>> + __be64 __iomem *invalidate = rm ?
>> + (__be64 __iomem *)pe->tce_inval_reg_phys :
>> + pe->tce_inval_reg;
>> +
>> + pnv_pci_ioda2_tce_do_invalidate(pe->pe_number, rm,
>> + invalidate, tbl->it_page_shift,
>> + index, npages);
>> + }
>> +}
>> +
>
> I don't understand this well and need a teaching session: one IOMMU
> table can be connected to multiple IOMMU table groups, each of which
> can be regarded as equivalent to one PE. It means one IOMMU table
> can be shared by two PEs. There must be something I missed.
No, this is correct.
> Could you give a teaching session with an example about the IOMMU
> table sharing? :-)
If you do not share tables, you have multiple IOMMU groups passed to
QEMU, all the actual devices are capable of 64bit DMA, and you have
multiple PHBs in QEMU (each backed by a 64bit TCE table which is updated
once at boot time and never changes), then all these tables will have
exactly the same content.
Another thing: if you do not want multiple PHBs in QEMU and you do not
have table sharing, every H_PUT_TCE request would have to update each
group's TCE table, not just one. Not a very fast approach.
So it seems a useful thing. If you do not want sharing, just add another
virtual PHB and put the vfio-pci devices on it.
--
Alexey