Date:	Sun, 8 Jul 2007 19:11:54 +0300
From:	Muli Ben-Yehuda <muli@...ibm.com>
To:	"Williams, Mitch A" <mitch.a.williams@...el.com>
Cc:	David Miller <davem@...emloft.net>,
	shemminger@...ux-foundation.org, netdev@...r.kernel.org
Subject: Re: [RFC 2/2] shrink size of scatterlist on common i386/x86-64

On Fri, Jul 06, 2007 at 10:14:56AM -0700, Williams, Mitch A wrote:
> David Miller wrote:
> >> Okay, but then using SG lists makes skbuff's much bigger.
> >>     
> >> 	fraglist	scatterlist		   per skbuff
> >> 32 bit	8		20		+12 * 18 = +216!
> >> 64 bit	16		32		+16 * 18 = +288
> >> 
> >> So never mind...
> >
> >I know, this is why nobody ever really tries to tackle this.
> >
> >> I'll do a fraglist to scatter list set of routines, but not sure
> >> if it's worth it.
> >
> >It's better to add dma_map_skb() et al. interfaces to be honest.
> >
> >Also even with the scatterlist idea, we'd still need to do two
> >map calls, one for skb->data and one for the page vector.
> 
> FWIW, I tried this about a year ago to try to improve e1000
> performance on pSeries.  I was hoping to simplify the driver
> transmit code and make IOMMU mapping easier.  This was on 2.6.16 or
> thereabouts.
> 
> Net result:  zilch.  No performance increase, no noticeable CPU
> utilization benefits.  Nothing.  So I dropped it.

Do you have pointers to the patches perchance?
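
For the archives, a dma_map_skb() of the kind David suggests could look
roughly like the sketch below: one call that maps skb->data and then each
page in the frag array, handing the driver an array of bus addresses.
Just a sketch, to be clear: dma_map_skb(), struct skb_dma_entry and its
fields are made up for illustration and are not an existing interface.

#include <linux/skbuff.h>
#include <linux/dma-mapping.h>

/* One mapped segment of an skb: bus address plus length. */
struct skb_dma_entry {
	dma_addr_t	addr;
	unsigned int	len;
};

/*
 * Map the linear part and every frag of @skb for DMA in one call.
 * Returns the number of entries filled in, or a negative errno.
 */
static int dma_map_skb(struct device *dev, struct sk_buff *skb,
		       struct skb_dma_entry *ents, int nents)
{
	int i, n = 0;

	if (nents < skb_shinfo(skb)->nr_frags + 1)
		return -EINVAL;

	/* Linear part: skb->data, skb_headlen() bytes. */
	ents[n].addr = dma_map_single(dev, skb->data, skb_headlen(skb),
				      DMA_TO_DEVICE);
	if (dma_mapping_error(dev, ents[n].addr))
		return -ENOMEM;
	ents[n].len = skb_headlen(skb);
	n++;

	/* Paged frags. */
	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];

		ents[n].addr = dma_map_page(dev, frag->page,
					    frag->page_offset, frag->size,
					    DMA_TO_DEVICE);
		if (dma_mapping_error(dev, ents[n].addr))
			goto unmap;
		ents[n].len = frag->size;
		n++;
	}

	return n;

unmap:
	dma_unmap_single(dev, ents[0].addr, ents[0].len, DMA_TO_DEVICE);
	for (i = 1; i < n; i++)
		dma_unmap_page(dev, ents[i].addr, ents[i].len, DMA_TO_DEVICE);
	return -ENOMEM;
}

Something along these lines keeps the skbuff itself at its current size
while still letting the mapping code see the whole skb at once, at the
cost of the driver carrying the entry array.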

> Slightly off topic:
> The real problem that I saw on pSeries is lock contention for the IOMMU.
> It's architected with a single table per slot, which is great in that
> two boards in separate slots won't have lock contention.  However, this
> all goes out the window when you drop a quad-port gigabit adapter in
> there.
> The time spent waiting for the IOMMU table lock goes up exponentially
> as you activate each additional port.
> 
> In my opinion, IOMMU table locking is the major issue with this type
> of architecture.  Since both Intel and AMD are touting IOMMUs for
> virtualization support, this is an issue that's going to need a lot
> of scrutiny.

Agreed.
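
To put the contention point in concrete terms, the allocation path on a
pSeries-style table boils down to something like the sketch below (the
names are made up, this is not the actual arch/powerpc TCE code): every
dma_map_*() call for every port behind the same slot has to take the one
table lock around the bitmap search.

#include <linux/spinlock.h>
#include <linux/bitmap.h>

/* One IOMMU translation table, shared by every device in the slot. */
struct iommu_tbl {
	spinlock_t	lock;		/* single lock per table/slot     */
	unsigned long	*map;		/* bitmap of in-use table entries */
	unsigned long	nentries;
};

/* Allocate @npages contiguous entries; returns the first entry or -1. */
static long iommu_tbl_alloc(struct iommu_tbl *tbl, unsigned long npages)
{
	unsigned long flags, entry;

	spin_lock_irqsave(&tbl->lock, flags);	/* the contended point */
	entry = bitmap_find_next_zero_area(tbl->map, tbl->nentries,
					   0, npages, 0);
	if (entry < tbl->nentries)
		bitmap_set(tbl->map, entry, npages);
	spin_unlock_irqrestore(&tbl->lock, flags);

	return entry < tbl->nentries ? (long)entry : -1;
}

With four ports streaming on one adapter, every TX/RX mapping serializes
on tbl->lock, which is exactly the behaviour Mitch describes.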

Cheers,
Muli
