Message-ID: <20110330191511.GS18712@sequoia.sous-sol.org>
Date:	Wed, 30 Mar 2011 12:15:11 -0700
From:	Chris Wright <chrisw@...s-sol.org>
To:	Mike Travis <travis@....com>
Cc:	Chris Wright <chrisw@...s-sol.org>,
	David Woodhouse <dwmw2@...radead.org>,
	Jesse Barnes <jbarnes@...tuousgeek.org>,
	linux-pci@...r.kernel.org, iommu@...ts.linux-foundation.org,
	Mike Habeck <habeck@....com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/4] Intel pci: Remove Host Bridge devices from identity
 mapping

* Mike Travis (travis@....com) wrote:
> Chris Wright wrote:
> >* Mike Travis (travis@....com) wrote:
> >>    When the IOMMU is being used, each request for a DMA mapping requires
> >>    the intel_iommu code to look for some space in the DMA mapping table.
> >>    For most drivers this occurs for each transfer.
> >>
> >>    When there are many outstanding DMA mappings [as seems to be the case
> >>    with the 10GigE driver], the table grows large and the search for
> >>    space becomes increasingly time consuming.  Performance for the
> >>    10GigE driver drops to about 10% of its capacity on a UV system
> >>    when the CPU count is large.
> >
> >That's pretty poor.  I've seen large overheads, but when that big it was
> >also related to issues in the 10G driver.  Do you have profile data
> >showing this as the hotspot?
> 
> Here's one from our internal bug report:
> 
> Here is a profile from a run with iommu=on  iommu=pt  (no forcedac)

OK, I was actually interested in the !pt case, but this is still
useful.  The iova lookup is distinct from the identity_mapping() case.

> uv48-sys was receiving and uv-debug sending.
> ksoftirqd/640 was running at approx. 100% cpu utilization.
> I had pinned the nttcp process on uv48-sys to cpu 64.
> 
> # Samples: 1255641
> #
> # Overhead        Command  Shared Object  Symbol
> # ........  .............  .............  ......
> #
>    50.27%  ksoftirqd/640  [kernel]       [k] _spin_lock
>    27.43%  ksoftirqd/640  [kernel]       [k] iommu_no_mapping

> ...
>      0.48%  ksoftirqd/640  [kernel]       [k] iommu_should_identity_map
>      0.45%  ksoftirqd/640  [kernel]       [k] ixgbe_alloc_rx_buffers  [ixgbe]

Note, ixgbe has had rx dma mapping issues (that's why I wondered what
was causing the massive slowdown under !pt mode).

<snip>
> I tracked this time down to identity_mapping() in this loop:
> 
>       list_for_each_entry(info, &si_domain->devices, link)
>               if (info->dev == pdev)
>                       return 1;
> 
> I didn't get the exact count, but there were approximately 11,000 PCI
> devices on this system.  And this function was called for every page
> request in each DMA request.

Right, so this is the list traversal (and wow, a lot of PCI devices).
Did you try a smarter data structure? (While there's room for another
bit in pci_dev, the bit is more about iommu implementation details than
anything at the pci level).

Or note that the device_domain_info is cached in the archdata of the
device struct; you should be able to just reference that directly.

Didn't think it through completely, but perhaps something as simple as:

	return pdev->dev.archdata.iommu == si_domain;
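
Or, if archdata.iommu actually holds the device_domain_info pointer
rather than the domain itself, something more like this (untested
sketch, ignoring whatever sentinel values archdata.iommu might hold
for devices that were never attached):

	static int identity_mapping(struct pci_dev *pdev)
	{
		struct device_domain_info *info;

		if (likely(!iommu_identity_mapping))
			return 0;

		/* set when the device was attached to its domain */
		info = pdev->dev.archdata.iommu;
		if (info)
			return info->domain == si_domain;

		return 0;
	}

Either way that turns the per-mapping cost from a walk over ~11,000
list entries into a constant-time pointer check.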

thanks,
-chris
