Date: Thu, 12 Oct 2017 11:41:20 +0200
From: Tomasz Nowicki <tnowicki@...iumnetworks.com>
To: Tomasz Nowicki <tomasz.nowicki@...iumnetworks.com>, joro@...tes.org,
 robin.murphy@....com
Cc: will.deacon@....com, Jayachandran.Nair@...ium.com,
 ard.biesheuvel@...aro.org, linux-kernel@...r.kernel.org,
 iommu@...ts.linux-foundation.org, linux-arm-kernel@...ts.infradead.org,
 Ganapatrao.Kulkarni@...ium.com
Subject: Re: [PATCH V2 0/1] Optimise IOVA allocations for PCI devices

Hi Joerg,

Can you please have a look and see if you are fine with this patch?

Thanks in advance,
Tomasz

On 20.09.2017 10:52, Tomasz Nowicki wrote:
> Here is the test setup where I started my performance measurements.
>
>  ------------  PCIe  -------------   TX   -------------  PCIe  -----
> | ThunderX2  |------| Intel XL710 | ---> | Intel XL710 |------| X86 |
> | (128 cpus) |      |    40GbE    |      |    40GbE    |       -----
>  ------------        -------------        -------------
>
> As the reference, let's take a v4.13 host with SMMUv3 off and 1-thread
> iperf pinned to one CPU with taskset. The performance results I got:
>
> SMMU off -> 100%
> SMMU on  -> 0.02%
>
> I followed the DMA mapping path and found that the 32-bit IOVA space was
> full, so the kernel was flushing the rcaches of all CPUs in (1).
> For 128 CPUs, this kills the performance. Furthermore, in my case the
> rcaches contained mostly PFNs above 32 bits, so the second round of IOVA
> allocation failed as well. As a consequence, the IOVA had to be allocated
> outside of the 32-bit space (2) from scratch, since all rcaches had been
> flushed in (1).
>
>         if (dma_limit > DMA_BIT_MASK(32) && dev_is_pci(dev))
> (1)-->          iova = alloc_iova_fast(iovad, iova_len, DMA_BIT_MASK(32) >> shift);
>
>         if (!iova)
> (2)-->          iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift);
>
> My fix simply introduces a parameter for alloc_iova_fast() that decides
> whether the rcache flush has to be done or not. All users follow the
> scenario above, so they should leave the flush as a last resort to avoid
> the time-costly iteration over all CPUs.
>
> This brings my iperf performance back to 100% with SMMU on.
>
> My concern with this solution is that machines with a relatively small
> number of CPUs may get DAC addresses more frequently for PCI devices.
> Please let me know your thoughts.
>
> Changelog:
>
> v1 --> v2
> - add missing documentation
> - fix typo
>
> Tomasz Nowicki (1):
>   iommu/iova: Make rcache flush optional on IOVA allocation failure
>
>  drivers/iommu/amd_iommu.c   |  5 +++--
>  drivers/iommu/dma-iommu.c   |  6 ++++--
>  drivers/iommu/intel-iommu.c |  5 +++--
>  drivers/iommu/iova.c        | 11 ++++++-----
>  include/linux/iova.h        |  5 +++--
>  5 files changed, 19 insertions(+), 13 deletions(-)
>
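
For readers following the thread, the two-step allocation described in the cover letter can be sketched as a small user-space model. This is only an illustration under stated assumptions, not the patch itself: struct toy_domain, toy_domain_full_below_32bit(), toy_alloc_fast() and toy_alloc_for_pci() are hypothetical names, the flush is modelled as a simple per-CPU visit counter, and the model assumes (as in the reported case) that a flush frees nothing usable below the 32-bit limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NCPUS 128u /* mirrors the 128-CPU ThunderX2 from the test setup */

/* Hypothetical toy domain; this is NOT the kernel's struct iova_domain. */
struct toy_domain {
	unsigned long next_low, last_low;   /* free PFNs below the 32-bit limit */
	unsigned long next_high, last_high; /* free PFNs above the 32-bit limit */
	unsigned int cache_visits;          /* per-CPU caches walked by flushes */
};

/* Build a domain whose 32-bit range is exhausted while the upper range has room. */
struct toy_domain toy_domain_full_below_32bit(void)
{
	struct toy_domain d = {
		.next_low = 0x100000ul, .last_low = 0xffffful, /* next > last: empty */
		.next_high = 0x100000ul, .last_high = 0x1fffffful,
		.cache_visits = 0,
	};
	return d;
}

/*
 * Stand-in for an allocator with an optional flush. Returns a PFN, or 0 on
 * failure. In this toy model a flush only costs one visit per CPU and never
 * frees a usable PFN.
 */
unsigned long toy_alloc_fast(struct toy_domain *d, bool below_32bit,
			     bool flush_rcache)
{
	assert(d != NULL);

	unsigned long *next = below_32bit ? &d->next_low : &d->next_high;
	unsigned long last = below_32bit ? d->last_low : d->last_high;

	if (*next <= last)
		return (*next)++;
	if (!flush_rcache)
		return 0; /* fail fast: the caller still has a fallback */

	d->cache_visits += TOY_NCPUS; /* the expensive walk over all CPUs */
	return 0;
}

/* Step (1): try below 32 bits. Step (2): fall back to the full range. */
unsigned long toy_alloc_for_pci(struct toy_domain *d,
				bool flush_only_as_last_resort)
{
	unsigned long pfn = toy_alloc_fast(d, true, !flush_only_as_last_resort);

	if (!pfn)
		pfn = toy_alloc_fast(d, false, true);
	return pfn;
}
```

Calling toy_alloc_for_pci() on a domain from toy_domain_full_below_32bit() returns the same PFN either way; with flush_only_as_last_resort set to false the model records 128 cache visits before falling back, and with it set to true it records none. The model says nothing about the DAC-address concern raised in the cover letter.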