Message-ID: <D12839161ADD3A4B8DA63D1A134D084026E48BA730@ESGSCCMS0001.eapac.ericsson.se>
Date:	Sun, 10 Apr 2011 15:02:53 +0800
From:	Wei Gu <wei.gu@...csson.com>
To:	Eric Dumazet <eric.dumazet@...il.com>,
	Alexander Duyck <alexander.h.duyck@...el.com>
CC:	netdev <netdev@...r.kernel.org>,
	"Kirsher, Jeffrey T" <jeffrey.t.kirsher@...el.com>
Subject: RE: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel

Hi,
I haven't enabled packet header splitting, so I don't think netdev_alloc_page() will be called in my case.

BTW, I also ran the same test on kernels 2.6.33 through 2.6.35. The problem appears to start with 2.6.35: 2.6.32, 2.6.33 and 2.6.34 do not show it at all and all perform well.

I modified the .configs based on FC13/14, manually set the DMAR default to off, and chose SLAB as the memory allocator (same as RHEL6 2.6.32). For more details about the configs, please check the attached files.

Thanks
WeiGu

-----Original Message-----
From: Eric Dumazet [mailto:eric.dumazet@...il.com]
Sent: Saturday, April 09, 2011 2:37 PM
To: Wei Gu
Cc: Alexander Duyck; netdev; Kirsher, Jeffrey T
Subject: RE: Low performance Intel 10GE NIC (3.2.10) on 2.6.38 Kernel

On Saturday, 09 April 2011 at 11:27 +0800, Wei Gu wrote:
> Hi Eric,
> If I bind the 8 tx and rx queues to different NUMA nodes (cores 3,7,11,15,19,23,27,31), it doesn't seem to help with the rx_missing_error anymore.
>
> I still think the best performance would come from binding the NIC to one CPU socket with its local memory node.
> I tried many combinations on the 2.6.32 kernel: binding eth10 to NODE2/3 gave 20% more performance compared to NODE0/1.
> So I guess CPU sockets 2 and 3 are local to eth10.
>
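
For reference, binding a queue interrupt to one core is usually done by writing a hexadecimal CPU bitmask to /proc/irq/<N>/smp_affinity. The small user-space sketch below only shows how such a mask could be derived for the cores listed above. The helper name cpu_affinity_mask() is made up for the example, and the assumption that this is how the queues were bound is not stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel or libc API): return the bitmask
 * with only bit `cpu` set, as used in /proc/irq/<N>/smp_affinity.
 * Limited to CPUs 0..63 so the result fits in 64 bits.
 */
static uint64_t cpu_affinity_mask(unsigned int cpu)
{
	assert(cpu < 64);
	return (uint64_t)1 << cpu;
}
```

For example, core 3 gives the mask 0x8 and core 31 gives 0x80000000.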

Ideally, you would need to split memory loads on several nodes, because you have a workload on a single NIC, located on a given node Nx.


1) Let the buffers where NIC performs DMA be on Nx, so that DMA is fast.

2) And everything else on other nodes, so that cpus can steal some memory bandwidth from other nodes, and free Nx memory bandwidth for NIC use. (Processors only need to fetch first cache line of packets to perform routing decision)

alloc_skb() would need to use memory from node Ny for "struct sk_buff", and memory from node Nx for "skb->data" and skb frags [ netdev_alloc_page() in ixgbe case]

In your case, you have 4 nodes, so Ny would be in a set of 3 nodes.

So commit 564824b0c52c34692d804b would need a little tweak in your case [ where your cpus need to bring only one cache line from the packet payload ]
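
As a rough illustration of the placement described above, the choice of Ny could be sketched in plain user-space C as follows. This is not kernel code and not part of the patch: the helper name pick_metadata_node(), its parameters and the round-robin policy are assumptions made up for the example. It only shows how Ny could be drawn from the nodes other than Nx.

```c
#include <assert.h>

/*
 * Hypothetical helper: choose the node Ny used for "struct sk_buff"
 * metadata, given the node Nx (nic_node) that holds the DMA buffers.
 * Successive values of seq rotate over every node except nic_node.
 */
static int pick_metadata_node(int nic_node, int nr_nodes, unsigned int seq)
{
	int idx;

	assert(nr_nodes > 0);

	/* Only one node: there is nothing to spread across. */
	if (nr_nodes == 1)
		return 0;

	/* Unknown or invalid NIC node: rotate over all nodes. */
	if (nic_node < 0 || nic_node >= nr_nodes)
		return (int)(seq % (unsigned int)nr_nodes);

	/* Map seq onto the nr_nodes - 1 remaining nodes, skipping nic_node. */
	idx = (int)(seq % (unsigned int)(nr_nodes - 1));
	return idx >= nic_node ? idx + 1 : idx;
}
```

With 4 nodes and the NIC on node 2, seq values 0, 1, 2, 3 select nodes 0, 1, 3, 0, so node 2 is left for the NIC buffers.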

Please try the following patch:



 include/linux/skbuff.h |   14 +-------------
 net/core/skbuff.c      |   19 +++++++++++++++++++
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index d0ae90a..b43626d 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1567,19 +1567,7 @@ static inline struct sk_buff *netdev_alloc_skb_ip_align(struct net_device *dev,
        return skb;
 }

-/**
- *     __netdev_alloc_page - allocate a page for ps-rx on a specific device
- *     @dev: network device to receive on
- *     @gfp_mask: alloc_pages_node mask
- *
- *     Allocate a new page. dev currently unused.
- *
- *     %NULL is returned if there is no free memory.
- */
-static inline struct page *__netdev_alloc_page(struct net_device *dev, gfp_t gfp_mask)
-{
-       return alloc_pages_node(NUMA_NO_NODE, gfp_mask, 0);
-}
+extern struct page *__netdev_alloc_page(struct net_device *dev, gfp_t gfp_mask);

 /**
  *     netdev_alloc_page - allocate a page for ps-rx on a specific device
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7ebeed0..877797e 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -259,6 +259,25 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 }
 EXPORT_SYMBOL(__netdev_alloc_skb);

+/**
+ *     __netdev_alloc_page - allocate a page for ps-rx on a specific device
+ *     @dev: network device to receive on
+ *     @gfp_mask: alloc_pages_node mask
+ *
+ *     Allocate a new page. dev currently unused.
+ *
+ *     %NULL is returned if there is no free memory.
+ */
+struct page *__netdev_alloc_page(struct net_device *dev, gfp_t gfp_mask)
+{
+       int node = dev->dev.parent ? dev_to_node(dev->dev.parent) : NUMA_NO_NODE;
+       struct page *page;
+
+       page = alloc_pages_node(node, gfp_mask, 0);
+       return page;
+}
+EXPORT_SYMBOL(__netdev_alloc_page);
+
 void skb_add_rx_frag(struct sk_buff *skb, int i, struct page *page, int off,
                int size)
 {
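
The node selection in the new __netdev_alloc_page() can be modelled outside the kernel. The sketch below is a user-space mock, under the assumption that dev_to_node() simply returns the NUMA node recorded for the parent device. struct mock_device, MOCK_NUMA_NO_NODE and mock_page_node() are invented names for the illustration, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's NUMA_NO_NODE (-1): "no node preference". */
#define MOCK_NUMA_NO_NODE (-1)

/* Minimal mock of a device: a parent pointer and a recorded NUMA node. */
struct mock_device {
	struct mock_device *parent;
	int numa_node;
};

/*
 * Mirror of the patch logic: use the parent device's node when a parent
 * exists (e.g. the PCI device behind the netdev), otherwise express no
 * node preference.
 */
static int mock_page_node(const struct mock_device *netdev_dev)
{
	assert(netdev_dev != NULL);
	return netdev_dev->parent ? netdev_dev->parent->numa_node
				  : MOCK_NUMA_NO_NODE;
}
```

A netdev whose parent sits on node 2 would get its pages from node 2, while a device without a parent (a virtual device, for instance) falls back to MOCK_NUMA_NO_NODE, which matches the behaviour before the patch.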



Download attachment "config-2.6.34" of type "application/octet-stream" (103757 bytes)

Download attachment "config-2.6.35" of type "application/octet-stream" (107835 bytes)
