Message-ID: <ad5bc549-d83f-bee0-9a9f-03a5afd7f3d9@huawei.com>
Date: Mon, 19 Jul 2021 17:14:28 +0100
From: John Garry <john.garry@...wei.com>
To: Ming Lei <ming.lei@...hat.com>, Robin Murphy <robin.murphy@....com>
CC: <iommu@...ts.linux-foundation.org>, Will Deacon <will@...nel.org>,
<linux-arm-kernel@...ts.infradead.org>,
<linux-nvme@...ts.infradead.org>, <linux-kernel@...r.kernel.org>
Subject: Re: [bug report] iommu_dma_unmap_sg() is very slow when running IO
from remote numa node
On 09/07/2021 15:24, Ming Lei wrote:
>> associated compromises.
> Follows the log of 'perf report'
>
> 1) good(run fio from cpus in the nvme's numa node)
Hi Ming,
If you're still interested in this issue then, as an experiment only, you
can try my rebased patches here:
https://github.com/hisilicon/kernel-dev/commits/private-topic-smmu-5.14-cmdq-4
I think that you should see a significant performance boost.
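
Roughly, something like the following should be enough to pick up the
branch and build it - untested from here, so treat it as a sketch, and
the config/install steps will obviously depend on your setup:

  git clone https://github.com/hisilicon/kernel-dev.git
  cd kernel-dev
  git checkout private-topic-smmu-5.14-cmdq-4
  # reuse your existing .config (or make defconfig), then build and install
  make olddefconfig
  make -j$(nproc)
  make modules_install install
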
Thanks
John
>
> - 34.86%  1.73%  fio  [nvme]  [k] nvme_process_cq
>    - 33.13% nvme_process_cq
>       - 32.93% nvme_pci_complete_rq
>          - 24.92% nvme_unmap_data
>             - 20.08% dma_unmap_sg_attrs
>                - 19.79% iommu_dma_unmap_sg
>                   - 19.55% __iommu_dma_unmap
>                      - 16.86% arm_smmu_iotlb_sync
>                         - 16.81% arm_smmu_tlb_inv_range_domain
>                            - 14.73% __arm_smmu_tlb_inv_range
>                                 14.44% arm_smmu_cmdq_issue_cmdlist
>                              0.89% __pi_memset
>                              0.75% arm_smmu_atc_inv_domain
>                      + 1.58% iommu_unmap_fast
>                      + 0.71% iommu_dma_free_iova
>             - 3.25% dma_unmap_page_attrs
>                - 3.21% iommu_dma_unmap_page
>                   - 3.14% __iommu_dma_unmap_swiotlb
>                      - 2.86% __iommu_dma_unmap
>                         - 2.48% arm_smmu_iotlb_sync
>                            - 2.47% arm_smmu_tlb_inv_range_domain
>                               - 2.19% __arm_smmu_tlb_inv_range
>                                    2.16% arm_smmu_cmdq_issue_cmdlist
>             + 1.34% mempool_free
>          + 7.68% nvme_complete_rq
>    + 1.73% _start
>
>
> 2) bad(run fio from cpus not in the nvme's numa node)
> - 49.25%  3.03%  fio  [nvme]  [k] nvme_process_cq
>    - 46.22% nvme_process_cq
>       - 46.07% nvme_pci_complete_rq
>          - 41.02% nvme_unmap_data
>             - 34.92% dma_unmap_sg_attrs
>                - 34.75% iommu_dma_unmap_sg
>                   - 34.58% __iommu_dma_unmap
>                      - 33.04% arm_smmu_iotlb_sync
>                         - 33.00% arm_smmu_tlb_inv_range_domain
>                            - 31.86% __arm_smmu_tlb_inv_range
>                                 31.71% arm_smmu_cmdq_issue_cmdlist
>                      + 0.90% iommu_unmap_fast
>             - 5.17% dma_unmap_page_attrs
>                - 5.15% iommu_dma_unmap_page
>                   - 5.12% __iommu_dma_unmap_swiotlb
>                      - 5.05% __iommu_dma_unmap
>                         - 4.86% arm_smmu_iotlb_sync
>                            - 4.85% arm_smmu_tlb_inv_range_domain
>                               - 4.70% __arm_smmu_tlb_inv_range
>                                    4.67% arm_smmu_cmdq_issue_cmdlist
>             + 0.74% mempool_free
>          + 4.83% nvme_complete_rq
>    + 3.03% _start
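
For reference, the two runs above can be reproduced with something along
these lines - the fio job options below are my guess, since the original
job file isn't in this thread, and the device/node numbers are examples:

  # NUMA node of the NVMe controller's PCI device
  cat /sys/class/nvme/nvme0/device/numa_node

  # "good" case: pin fio to CPUs on that node (say node 0)
  numactl --cpunodebind=0 fio --name=local --filename=/dev/nvme0n1 \
      --direct=1 --rw=randread --bs=4k --iodepth=64 --numjobs=4 \
      --ioengine=libaio --runtime=60 --time_based

  # "bad" case: pin fio to CPUs on a remote node (say node 1)
  numactl --cpunodebind=1 fio --name=remote --filename=/dev/nvme0n1 \
      --direct=1 --rw=randread --bs=4k --iodepth=64 --numjobs=4 \
      --ioengine=libaio --runtime=60 --time_based

  # capture and view the call graphs as above
  perf record -a -g -- sleep 30
  perf report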