Message-ID: <20090817095302.0c41ef68@jbarnes-g45>
Date:	Mon, 17 Aug 2009 09:53:02 -0700
From:	Jesse Barnes <jbarnes@...tuousgeek.org>
To:	Bill Fink <billfink@...dspring.com>
Cc:	"Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
	Neil Horman <nhorman@...driver.com>,
	Andrew Gallatin <gallatin@...i.com>,
	Brice Goglin <Brice.Goglin@...ia.fr>,
	Linux Network Developers <netdev@...r.kernel.org>,
	Yinghai Lu <yhlu.kernel@...il.com>
Subject: Re: Receive side performance issue with multi-10-GigE and NUMA

On Fri, 14 Aug 2009 16:31:55 -0400
Bill Fink <billfink@...dspring.com> wrote:

> On Wed, 12 Aug 2009, Bill Fink wrote:
> 
> > On Tue, 11 Aug 2009, Brandeburg, Jesse wrote:
> > 
> > > Bill Fink wrote:
> > > > On Sat, 8 Aug 2009, Neil Horman wrote:
> > > > 
> > > >> On Sat, Aug 08, 2009 at 02:21:36PM -0400, Andrew Gallatin
> > > >> wrote:
> > > >>> Neil Horman wrote:
> > > >>>> On Sat, Aug 08, 2009 at 07:08:20AM -0400, Andrew Gallatin
> > > >>>> wrote:
> > > >>>>> Bill Fink wrote:
> > > >>>>>> On Fri, 07 Aug 2009, Andrew Gallatin wrote:
> > > >>>>>> 
> > > >>>>>>> Bill Fink wrote:
> > > >>>>>>> 
> > > >>>>>>>> All sysfs local_cpus values are the same
> > > >>>>>>>> (00000000,000000ff), so yes they are also wrong.
> > > 
> > > bill, I recently helped Jesse Barnes push a patch that addresses
> > > this kind of issue on CoreI7, the root cause was the numa_node
> > > variable was initialized based on slot on AMD systems, but needed
> > > to be set to -1 by default on systems with a uniform IOH to slot
> > > architecture.
> > > 
> > > here is the commit ID:
> > > http://git.kernel.org/?p=linux/kernel/git/sfr/linux-next.git;a=commit;h=3c38d674be519109696746192943a6d524019f7f
> > > 
> > > I'm not sure it is in linus' tree yet, this link is to net-next
> > > 
> > > Maybe see if it helps?
> > 
> > It's worth a shot.
> > 
> > Hopefully I can get a chance to build a new kernel tomorrow to check
> > out some of the suggestions, like this one, the setting of
> > ACPI_DEBUG, and the new ftrace module for checking NUMA affinity of
> > skbs.
> 
> I applied this patch to my 2.6.29.6 kernel (from Fedora 11).
> 
> Now when I do:
> 
> 	find /sys -name numa_node -exec grep . {} /dev/null \;
> 
> the numa_node for _all_ PCI devices is -1.

Yeah, that sounds right (indicates they're not really tied to a
specific node).
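
For anyone who wants the same check in script form, here is a minimal
sketch of what the find/grep above does, restricted to PCI devices.
The function name read_numa_nodes is made up for illustration, and the
default path assumes the usual /sys/bus/pci/devices layout. A value of
-1 means the kernel has not tied the device to a specific node.

```python
import glob
import os


def read_numa_nodes(root="/sys/bus/pci/devices"):
    """Map each PCI device name to the integer in its numa_node file.

    'root' is an assumption: the standard sysfs directory of PCI devices.
    """
    nodes = {}
    for path in sorted(glob.glob(os.path.join(root, "*", "numa_node"))):
        # The parent directory name is the PCI address, e.g. 0000:01:00.0
        device = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            nodes[device] = int(f.read().strip())
    return nodes


if __name__ == "__main__":
    for device, node in read_numa_nodes().items():
        # -1 means "not associated with a specific NUMA node"
        note = "no specific node" if node == -1 else "node %d" % node
        print(device, note)
```

On a system with the patch applied, every device would be expected to
show up as "no specific node".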

> When I do:
> 
> 	find /sys -name local_cpus -exec grep . {} /dev/null \;
> 
> I find that local_cpus is always 00000000,00000000.
> 
> Is that OK or should it be 00000000,000000ff (for my dual quad-core
> Xeon 5580 system with no hyperthreading)?

Hm, yeah it probably should have the full CPU mask...
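
To make the two masks concrete, here is a rough sketch of how a sysfs
cpumask string decodes into CPU numbers. The helper name cpus_in_mask
is hypothetical. It assumes the usual sysfs format of comma-separated
32-bit hex words with the most significant word first.

```python
def cpus_in_mask(mask):
    """Return the sorted CPU ids set in a sysfs cpumask string.

    Assumes comma-separated hex words, most significant word first,
    as in '00000000,000000ff'.
    """
    # Dropping the commas leaves one big hex number; bit N set means CPU N.
    value = int(mask.strip().replace(",", ""), 16)
    return [cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1]


if __name__ == "__main__":
    print(cpus_in_mask("00000000,000000ff"))  # all eight cores, CPUs 0-7
    print(cpus_in_mask("00000000,00000000"))  # no CPUs at all
```

So 00000000,000000ff covers all eight cores on a dual quad-core box,
while the all-zero mask claims no CPU is local to the device.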

> Also, is it just not possible on this type of Intel Xeon system to
> properly associate the PCI devices with the nearest NUMA node?

All the PCI devices hang off the root complex, which is the same
distance to each node of memory (at least that's my understanding for
current platforms).

> In any event, the patch didn't help (or hurt).  The transmit
> performance remained at ~100 Gbps while the receive performance
> remained at 55 Gbps.

Maybe the other Jesse has some ideas here.

-- 
Jesse Barnes, Intel Open Source Technology Center