Message-ID: <YOgK8fdv7dOQtkET@T590>
Date: Fri, 9 Jul 2021 16:38:09 +0800
From: Ming Lei <ming.lei@...hat.com>
To: linux-nvme@...ts.infradead.org, Will Deacon <will@...nel.org>,
linux-arm-kernel@...ts.infradead.org,
iommu@...ts.linux-foundation.org
Cc: linux-kernel@...r.kernel.org
Subject: [bug report] iommu_dma_unmap_sg() is very slow when running IO from
remote numa node
Hello,
I observed that NVMe performance is very bad when running fio on a
CPU (aarch64) in the NUMA node remote from the NVMe PCI device,
compared with running on the device's local NUMA node.
Please see the test results[1]: 327K vs. 34.9K IOPS.
A latency trace shows that one big difference is in iommu_dma_unmap_sg(),
1111 nsecs vs. 25437 nsecs.
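For reference, per-call latency of iommu_dma_unmap_sg() can be measured
with a bpftrace one-liner like the one below (a sketch of one possible
method, not necessarily how the numbers above were collected; it assumes
bpftrace is installed and the symbol is kprobe-able on this kernel):

  bpftrace -e '
  kprobe:iommu_dma_unmap_sg { @start[tid] = nsecs; }
  kretprobe:iommu_dma_unmap_sg /@start[tid]/ {
          @ns = hist(nsecs - @start[tid]);
          delete(@start[tid]);
  }'

Run it while fio is pinned to a local CPU, then again with fio pinned
to a remote CPU, and compare the two latency histograms.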
[1] fio test & results
1) fio test result:
- run fio on local CPU
taskset -c 0 ~/git/tools/test/nvme/io_uring 10 1 /dev/nvme1n1 4k
+ fio --bs=4k --ioengine=io_uring --fixedbufs --registerfiles --hipri --iodepth=64 --iodepth_batch_submit=16 --iodepth_batch_complete_min=16 --filename=/dev/nvme1n1 --direct=1 --runtime=10 --numjobs=1 --rw=randread --name=test --group_reporting
IOPS: 327K
avg latency of iommu_dma_unmap_sg(): 1111 nsecs
- run fio on remote CPU
taskset -c 80 ~/git/tools/test/nvme/io_uring 10 1 /dev/nvme1n1 4k
+ fio --bs=4k --ioengine=io_uring --fixedbufs --registerfiles --hipri --iodepth=64 --iodepth_batch_submit=16 --iodepth_batch_complete_min=16 --filename=/dev/nvme1n1 --direct=1 --runtime=10 --numjobs=1 --rw=randread --name=test --group_reporting
IOPS: 34.9K
avg latency of iommu_dma_unmap_sg(): 25437 nsecs
2) system info
[root@...ere-mtjade-04 ~]# lscpu | grep NUMA
NUMA node(s): 2
NUMA node0 CPU(s): 0-79
NUMA node1 CPU(s): 80-159
lspci | grep NVMe
0003:01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
[root@...ere-mtjade-04 ~]# cat /sys/block/nvme1n1/device/device/numa_node
0
Thanks,
Ming