Message-ID: <Pine.LNX.4.64.0711091533300.17621@schroedinger.engr.sgi.com>
Date:	Fri, 9 Nov 2007 15:46:20 -0800 (PST)
From:	Christoph Lameter <clameter@....com>
To:	"David S. Miller" <davem@...emloft.net>,
	Herbert Xu <herbert@...dor.apana.org.au>
cc:	Nick Piggin <nickpiggin@...oo.com.au>, linux-kernel@...r.kernel.org
Subject: 2.6.24-rc2: Network commit causes SLUB performance regression with
 tbench

commit deea84b0ae3d26b41502ae0a39fe7fe134e703d0 seems to cause a drop
in SLUB tbench performance:

8p x86_64 system:

2.6.24-rc2:
	1260.80 MB/sec

After reverting the patch:
	2350.04 MB/sec

SLAB performance (2435.58 MB/sec, ~3.6% better than SLUB after the
revert) is not affected by the patch.

Since this is an alignment change, it seems that tbench performance is
sensitive to the data layout. SLUB packs objects more tightly than SLAB,
so adjacent 8-byte allocations can end up on the same cacheline; if
those objects are allocated from different CPUs, the result is cacheline
contention (false sharing). SLAB's minimum object size is 32 bytes, so
the contention there is likely more limited.
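
To make the layout arithmetic concrete, here is a minimal user-space
sketch; the 64-byte cacheline, the 8-byte SLUB object and the 32-byte
SLAB minimum are illustrative assumptions for this machine, not
measured values:

#include <stdio.h>

#define CACHE_LINE     64	/* assumed SMP_CACHE_BYTES */
#define SLUB_OBJ_SIZE   8	/* SLUB packs 8-byte objects back to back */
#define SLAB_MIN_SIZE  32	/* SLAB rounds small allocations up to this */

int main(void)
{
	/* Objects that share a cacheline can bounce it between CPUs. */
	printf("SLUB: %d objects per cacheline\n",
	       CACHE_LINE / SLUB_OBJ_SIZE);	/* 8 */
	printf("SLAB: %d objects per cacheline\n",
	       CACHE_LINE / SLAB_MIN_SIZE);	/* 2 */
	return 0;
}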

Maybe we need to allocate a minimum of one cacheline to the skb head? Or
pad it out to a full cacheline?
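
A rough sketch of what the padding could look like in __alloc_skb(),
reusing the existing SKB_DATA_ALIGN macro; this is untested and the
local variable 'alloc' is hypothetical, it is only meant to show the
shape of the change:

	/* Untested sketch: round the whole head allocation, including
	 * skb_shared_info, up to a cacheline multiple so heads handed
	 * out to different CPUs never share a cacheline. */
	size = SKB_DATA_ALIGN(size);
	alloc = SKB_DATA_ALIGN(size + sizeof(struct skb_shared_info));
	data = kmalloc_node_track_caller(alloc, gfp_mask, node);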




commit deea84b0ae3d26b41502ae0a39fe7fe134e703d0
Author: Herbert Xu <herbert@...dor.apana.org.au>
Date:   Sun Oct 21 16:27:46 2007 -0700

    [NET]: Fix SKB_WITH_OVERHEAD calculation

    The calculation in SKB_WITH_OVERHEAD is incorrect in that it can cause
    an overflow across a page boundary which is what it's meant to prevent.
    In particular, the header length (X) should not be lumped together with
    skb_shared_info.  The latter needs to be aligned properly while the header
    has no choice but to sit in front of wherever the payload is.

    Therefore the correct calculation is to take away the aligned size of
    skb_shared_info, and then subtract the header length.  The resulting
    quantity L satisfies the following inequality:

        SKB_DATA_ALIGN(L + X) + sizeof(struct skb_shared_info) <= PAGE_SIZE

    This is the quantity used by alloc_skb to do the actual allocation.

    Signed-off-by: Herbert Xu <herbert@...dor.apana.org.au>
    Signed-off-by: David S. Miller <davem@...emloft.net>

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f93f22b..369f60a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -41,8 +41,7 @@
 #define SKB_DATA_ALIGN(X)      (((X) + (SMP_CACHE_BYTES - 1)) & \
                                 ~(SMP_CACHE_BYTES - 1))
 #define SKB_WITH_OVERHEAD(X)   \
-       (((X) - sizeof(struct skb_shared_info)) & \
-        ~(SMP_CACHE_BYTES - 1))
+       ((X) - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
 #define SKB_MAX_ORDER(X, ORDER) \
        SKB_WITH_OVERHEAD((PAGE_SIZE << (ORDER)) - (X))
 #define SKB_MAX_HEAD(X)                (SKB_MAX_ORDER((X), 0))
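
For what it's worth, the overflow described in the commit message is
easy to reproduce in user space. All numbers below are illustrative
assumptions (4096-byte page, 64-byte cacheline, a made-up 200-byte
skb_shared_info and a 40-byte header), not the real structure sizes:

#include <stdio.h>

#define PAGE_SZ          4096	/* assumed page size */
#define SMP_CACHE_BYTES    64	/* assumed cacheline size */
#define SHINFO_SIZE       200	/* stand-in for sizeof(struct skb_shared_info) */

#define SKB_DATA_ALIGN(X)    (((X) + (SMP_CACHE_BYTES - 1)) & \
			      ~(SMP_CACHE_BYTES - 1))
#define OLD_WITH_OVERHEAD(X) (((X) - SHINFO_SIZE) & ~(SMP_CACHE_BYTES - 1))
#define NEW_WITH_OVERHEAD(X) ((X) - SKB_DATA_ALIGN(SHINFO_SIZE))

static void check(const char *name, unsigned int l, unsigned int hdr)
{
	/* What alloc_skb ends up allocating for payload L and header X. */
	unsigned int need = SKB_DATA_ALIGN(l + hdr) + SHINFO_SIZE;

	printf("%s: L=%u, allocation=%u -> %s\n", name, l, need,
	       need > PAGE_SZ ? "overflows the page" : "fits");
}

int main(void)
{
	unsigned int hdr = 40;			/* hypothetical header length X */
	unsigned int budget = PAGE_SZ - hdr;	/* SKB_MAX_HEAD's input */

	check("old", OLD_WITH_OVERHEAD(budget), hdr);	/* 4104 > 4096 */
	check("new", NEW_WITH_OVERHEAD(budget), hdr);	/* 4040 <= 4096 */
	return 0;
}

With these numbers the old macro hands back L=3840, which alloc_skb
inflates to 4104 bytes, past the page boundary; the new macro's L=3800
stays within it.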
