Message-ID: <08FE5CC30C9A3F41BF819A502CF7BF6E0198249D@fmsmsx411.amr.corp.intel.com>
Date:	Fri, 6 Jul 2007 10:14:56 -0700
From:	"Williams, Mitch A" <mitch.a.williams@...el.com>
To:	"David Miller" <davem@...emloft.net>,
	<shemminger@...ux-foundation.org>
Cc:	<netdev@...r.kernel.org>
Subject: RE: [RFC 2/2] shrink size of scatterlist on common i386/x86-64

David Miller wrote:
>> Okay, but then using SG lists makes skbuff's much bigger.
>>     
>>          fraglist   scatterlist   per skbuff
>> 32 bit   8          20            +12 * 18 = +216!
>> 64 bit   16         32            +16 * 18 = +288
>> 
>> So never mind...
>
>I know, this is why nobody ever really tries to tackle this.
>
>> I'll do a fraglist to scatter list set of routines, but not sure
>> if it's worth it.
>
>It's better to add dma_map_skb() et al. interfaces to be honest.
>
>Also even with the scatterlist idea, we'd still need to do two
>map calls, one for skb->data and one for the page vector.
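
The per-skbuff figures in the quoted table can be checked with a small
helper.  This is only a sketch of the arithmetic: the entry sizes (8/20
and 16/32 bytes) and the factor of 18 fragment slots are taken from the
table as given, and the names FRAGS_PER_SKB and extra_bytes_per_skb()
are made up for illustration, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Fragment slots per skbuff, taken from the "* 18" factor in the table. */
#define FRAGS_PER_SKB 18

/*
 * Extra bytes added to each skbuff if every fragment entry grows from
 * frag_size bytes (fraglist entry) to sg_size bytes (scatterlist entry).
 */
static size_t extra_bytes_per_skb(size_t frag_size, size_t sg_size)
{
	/* The sketch assumes the scatterlist entry is the larger one. */
	assert(sg_size >= frag_size);
	return (sg_size - frag_size) * FRAGS_PER_SKB;
}
```

Calling extra_bytes_per_skb(8, 20) gives the 32-bit figure and
extra_bytes_per_skb(16, 32) the 64-bit one.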

FWIW, I tried this about a year ago to try to improve e1000 performance
on pSeries.  I was hoping to simplify the driver transmit code and make
IOMMU mapping easier.  This was on 2.6.16 or thereabouts.

Net result:  zilch.  No performance increase, no noticeable CPU
utilization benefits.  Nothing.  So I dropped it.

Slightly off topic:
The real problem that I saw on pSeries is lock contention for the IOMMU.
It's architected with a single table per slot, which is great in that
two boards in separate slots won't have lock contention.  However, this
all goes out the window when you drop a quad-port gigabit adapter in
there.  The time spent waiting for the IOMMU table lock goes up
exponentially as you activate each additional port.
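
To make the locking structure concrete, here is a rough userspace model
of "one table, one lock, per slot".  It is not the pSeries IOMMU code;
every name in it (slot_iommu_table, port, table_for_port, map_one,
NUM_SLOTS) is hypothetical, and a pthread mutex stands in for whatever
lock the real table uses.  The point is only that all ports of a
multi-port adapter resolve to the same table, and therefore to the same
lock, while adapters in other slots do not.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_SLOTS 4	/* hypothetical slot count */

/* Hypothetical model: one mapping table and one lock per slot. */
struct slot_iommu_table {
	pthread_mutex_t lock;
	unsigned long next_free;	/* stand-in for real allocator state */
};

static struct slot_iommu_table slot_tables[NUM_SLOTS] = {
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
	{ PTHREAD_MUTEX_INITIALIZER, 0 },
};

/* A network port is identified by its slot and its index on the board. */
struct port {
	int slot;
	int index;
};

/* Every port in a slot resolves to that slot's single table. */
static struct slot_iommu_table *table_for_port(const struct port *p)
{
	assert(p->slot >= 0 && p->slot < NUM_SLOTS);
	return &slot_tables[p->slot];
}

/* Allocate one mapping entry; all ports in the slot serialize here. */
static unsigned long map_one(const struct port *p)
{
	struct slot_iommu_table *t = table_for_port(p);
	unsigned long entry;

	pthread_mutex_lock(&t->lock);
	entry = t->next_free++;
	pthread_mutex_unlock(&t->lock);
	return entry;
}
```

With a quad-port board in slot 0, all four ports call map_one() against
slot_tables[0] and queue on one mutex, whereas a board in slot 1 never
touches that lock.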

In my opinion, IOMMU table locking is the major issue with this type of
architecture.  Since both Intel and AMD are touting IOMMUs for
virtualization support, this is an issue that's going to need a lot of
scrutiny.

-Mitch