Date:	Fri, 28 Jun 2013 16:59:14 +0800
From:	wangyufen <wangyufen@...wei.com>
To:	<davem@...emloft.net>
CC:	<joro@...tes.org>, <Varun.Sethi@...escale.com>, <aik@...abs.ru>,
	<alex.williamson@...hat.com>, <netdev@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, <zhangdianfang@...wei.com>,
	Li Zefan <lizefan@...wei.com>
Subject: [Bug Report] bonding LACP mode aggregated NIC has performance
 problems while intel_iommu is on

           Summary: bonding LACP mode aggregated NIC has performance problems while intel_iommu is on
           Product: Networking
           Version: 
    Kernel Version: 3.10.0-rc5
          Platform: X86
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: bonding&ixgbe
        AssignedTo: 
        ReportedBy: wangyufen@...wei.com,wangweidong1@...wei.com
        Regression: No


Hi, I'm using bonding in LACP mode to aggregate four ports of an Intel 82599 NIC, and there are performance problems when the intel_iommu switch is on.
I'm using iperf to measure network throughput.
With intel_iommu off, the four aggregated ports reach 37.6 Gbits/sec;
with intel_iommu on, they reach only 28.7 Gbits/sec.


intel_iommu=off:
    dma_ops = &swiotlb_dma_ops, so map_page = swiotlb_map_page and unmap_page = swiotlb_unmap_page
intel_iommu=on:
    dma_ops = &intel_dma_ops, so map_page = intel_map_page and unmap_page = intel_unmap_page
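
For context, a simplified sketch of how the generic DMA API dispatches through dma_ops, paraphrased from include/asm-generic/dma-mapping-common.h in 3.10 (BUG_ON checks, kmemcheck annotations and debug_dma hooks trimmed):

/* Simplified 3.10-era dispatch sketch; error checks and debug hooks omitted. */
static inline dma_addr_t dma_map_page(struct device *dev, struct page *page,
                                      size_t offset, size_t size,
                                      enum dma_data_direction dir)
{
        struct dma_map_ops *ops = get_dma_ops(dev);

        /* intel_iommu=off -> swiotlb_map_page();
         * intel_iommu=on  -> intel_map_page(). */
        return ops->map_page(dev, page, offset, size, dir, NULL);
}

static inline dma_addr_t dma_map_single_attrs(struct device *dev, void *ptr,
                                              size_t size,
                                              enum dma_data_direction dir,
                                              struct dma_attrs *attrs)
{
        struct dma_map_ops *ops = get_dma_ops(dev);

        /* Same ops->map_page() callback, but page/offset are derived
         * from a kernel virtual address, and attrs is passed through. */
        return ops->map_page(dev, virt_to_page(ptr),
                             (unsigned long)ptr & ~PAGE_MASK, size,
                             dir, attrs);
}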

I think intel_dma_ops costs more in performance. Both dma_map_page() and
dma_map_single_attrs() end up calling map_page(), but with different parameters. Therefore, I ran a test
measuring the time spent in the functions that call map_page() and unmap_page().
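
The measurements were taken along these lines (a minimal sketch; the wrapper name and the global counters are hypothetical, and per-CPU accounting is omitted):

#include <linux/ktime.h>
#include <linux/dma-mapping.h>

/* Hypothetical instrumentation: wrap dma_map_page() and accumulate the
 * time spent inside the call across all invocations. */
static u64 map_total_ns;
static u64 map_calls;

static dma_addr_t timed_dma_map_page(struct device *dev, struct page *page,
                                     size_t offset, size_t size,
                                     enum dma_data_direction dir)
{
        ktime_t t0 = ktime_get();
        dma_addr_t addr = dma_map_page(dev, page, offset, size, dir);

        map_total_ns += ktime_to_ns(ktime_sub(ktime_get(), t0));
        map_calls++;
        return addr;
}

The table below gives the accumulated time ranges over the stated call counts.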

----------------------------------------------------------------------------
function (iommu state)              call count       accumulated time (ns)
dma_map_single_attrs, iommu off     350 * 10000        640,000 ~ 1,000,000
dma_map_single_attrs, iommu on      350 * 10000      4,900,000 ~ 5,700,000
dma_unmap_single_attrs, iommu off   350 * 10000        330,000 ~ 620,000
dma_unmap_single_attrs, iommu on    350 * 10000      3,000,000 ~ 47,000,000

dma_map_page, iommu off             2900 * 10000       350,000 ~ 610,000
dma_map_page, iommu on              2900 * 10000     2,160,000 ~ 3,000,000
dma_unmap_page, iommu off           2900 * 10000       345,000 ~ 670,000
dma_unmap_page, iommu on            2900 * 10000     3,000,000 ~ 4,300,000
----------------------------------------------------------------------------
The time spent in the map and unmap functions shows a huge gap between iommu off and iommu on.

