Message-ID: <1490164067-12552-1-git-send-email-thunder.leizhen@huawei.com>
Date: Wed, 22 Mar 2017 14:27:40 +0800
From: Zhen Lei <thunder.leizhen@...wei.com>
To: Joerg Roedel <joro@...tes.org>,
iommu <iommu@...ts.linux-foundation.org>,
Robin Murphy <robin.murphy@....com>,
David Woodhouse <dwmw2@...radead.org>,
Sudeep Dutt <sudeep.dutt@...el.com>,
Ashutosh Dixit <ashutosh.dixit@...el.com>,
linux-kernel <linux-kernel@...r.kernel.org>
CC: Zefan Li <lizefan@...wei.com>, Xinwei Hu <huxinwei@...wei.com>,
"Tianhong Ding" <dingtianhong@...wei.com>,
Hanjun Guo <guohanjun@...wei.com>,
Zhen Lei <thunder.leizhen@...wei.com>
Subject: [PATCH 0/7] iommu/iova: improve the allocation performance of dma64

64-bit devices are very common now, but currently we only define a cached32_node
to optimize the allocation performance of dma32. I have seen some dma64-capable
drivers choose to allocate iovas from the dma32 space first, perhaps because of
the current dma64 performance problem or for other reasons.
For example (in drivers/iommu/amd_iommu.c):

static unsigned long dma_ops_alloc_iova(......
{
	......
	if (dma_mask > DMA_BIT_MASK(32))
		pfn = alloc_iova_fast(&dma_dom->iovad, pages,
				      IOVA_PFN(DMA_BIT_MASK(32)));

	if (!pfn)
		pfn = alloc_iova_fast(&dma_dom->iovad, pages, IOVA_PFN(dma_mask));

As the example shows, the driver first tries to allocate from the dma32 space
and falls back to the full 64-bit mask only when that fails. For the details of
why dma64 iova allocation performance is so bad, please refer to the description
of patch 5.

In this patch series, I add a cached64_node to manage the dma64 iova space
(iova >= 4G); it has the same effect for dma64 as cached32_node has for dma32
(iova < 4G).
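
To illustrate the idea, below is a minimal userspace sketch of the cached-node
optimization. It is not the kernel's rbtree-based allocator: allocated ranges
are modeled as a list sorted in descending address order, and the names
(struct range, alloc_range, DMA32_PFN_LIMIT) are assumptions invented for this
sketch, not code from the series. The point it demonstrates is that remembering
the node of the last dma32/dma64 allocation lets the next allocation of the
same kind resume the top-down scan just below it, instead of stepping over
every allocated range from the top again:

#include <stdio.h>
#include <stdlib.h>

struct range {
	unsigned long lo, hi;	/* allocated pfn range [lo, hi] */
	struct range *next;	/* next lower-addressed range */
};

struct iova_domain {
	struct range *head;	/* highest-addressed range */
	struct range *cached32;	/* last dma32 allocation */
	struct range *cached64;	/* last dma64 allocation */
};

#define DMA32_PFN_LIMIT	0xfffffUL	/* last pfn below 4G (4K pages) */

/* Allocate 'size' pfns below 'limit', scanning top-down for a free gap. */
static struct range *alloc_range(struct iova_domain *d,
				 unsigned long size, unsigned long limit)
{
	struct range *cached = limit > DMA32_PFN_LIMIT ? d->cached64
						       : d->cached32;
	struct range **pp = &d->head;
	unsigned long high = limit;	/* top of the current free gap */

	if (cached) {
		/*
		 * Everything above the cached node is already occupied,
		 * so resume the scan just below it instead of walking
		 * all allocated ranges from the top again.
		 */
		pp = &cached->next;
		high = cached->lo - 1;
	}

	/* Walk down until the gap above *pp can hold 'size' pfns. */
	while (*pp && high - (*pp)->hi < size) {
		high = (*pp)->lo - 1;
		pp = &(*pp)->next;
	}
	if (!*pp && high + 1 < size)
		return NULL;	/* address space exhausted */

	struct range *r = malloc(sizeof(*r));
	if (!r)
		return NULL;
	r->hi = high;
	r->lo = high - size + 1;
	r->next = *pp;
	*pp = r;

	if (limit > DMA32_PFN_LIMIT)
		d->cached64 = r;
	else
		d->cached32 = r;
	return r;
}

int main(void)
{
	struct iova_domain dom = { 0 };

	/* Each dma64 allocation resumes right below the previous one. */
	for (int i = 0; i < 3; i++) {
		struct range *r = alloc_range(&dom, 16, 0xffffffffUL);
		if (r)
			printf("dma64 alloc %d: pfn [%#lx, %#lx]\n",
			       i, r->lo, r->hi);
	}
	return 0;
}

Without the cached64 pointer, every dma64 allocation in this model would
restart at 'limit' and step over all previously allocated ranges one by one,
which is essentially the linear-time behavior this series eliminates.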

Below is the performance data before and after this patch series:

(before)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35898
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.2 sec 7.88 MBytes 6.48 Mbits/sec
[ 5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35900
[ 5] 0.0-10.3 sec 7.88 MBytes 6.43 Mbits/sec
[ 4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 35902
[ 4] 0.0-10.3 sec 7.88 MBytes 6.43 Mbits/sec

(after)$ iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[ 4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36330
[ ID] Interval Transfer Bandwidth
[ 4] 0.0-10.0 sec 1.09 GBytes 933 Mbits/sec
[ 5] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36332
[ 5] 0.0-10.0 sec 1.10 GBytes 939 Mbits/sec
[ 4] local 192.168.1.106 port 5001 connected with 192.168.1.198 port 36334
[ 4] 0.0-10.0 sec 1.10 GBytes 938 Mbits/sec

Zhen Lei (7):
  iommu/iova: fix incorrect variable types
  iommu/iova: cut down judgement times
  iommu/iova: insert start_pfn boundary of dma32
  iommu/iova: adjust __cached_rbnode_insert_update
  iommu/iova: optimize the allocation performance of dma64
  iommu/iova: move the calculation of pad mask out of loop
  iommu/iova: fix iovad->dma_32bit_pfn as the last pfn of dma32

 drivers/iommu/amd_iommu.c        |   7 +-
 drivers/iommu/dma-iommu.c        |  22 ++----
 drivers/iommu/intel-iommu.c      |  11 +--
 drivers/iommu/iova.c             | 143 +++++++++++++++++++++------------------
 drivers/misc/mic/scif/scif_rma.c |   3 +-
 include/linux/iova.h             |   7 +-
 6 files changed, 94 insertions(+), 99 deletions(-)

--
2.5.0