Message-ID: <1324055915.25554.69.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
Date: Fri, 16 Dec 2011 18:18:35 +0100
From: Eric Dumazet <eric.dumazet@...il.com>
To: Christoph Lameter <cl@...ux.com>
Cc: linux-kernel <linux-kernel@...r.kernel.org>,
Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>,
"Alex,Shi" <alex.shi@...el.com>, Shaohua Li <shaohua.li@...el.com>,
Matt Mackall <mpm@...enic.com>
Subject: Re: [PATCH] slub: prefetch next freelist pointer in slab_alloc()
On Friday, 16 December 2011 at 10:31 -0600, Christoph Lameter wrote:
> On Fri, 16 Dec 2011, Eric Dumazet wrote:
>
> > Recycling a page is a problem, since freelist link chain is hot on
> > cpu(s) which freed objects, and possibly very cold on cpu currently
> > owning slab.
>
> Good idea. How do the tcp benchmarks look after this?
>
> Looks sane.
>
> Acked-by: Christoph Lameter <cl@...ux.com>
Thanks!

I wouldn't expect a huge win for TCP (most of the cpu time is consumed
in the TCP stack, not in memory allocations), but still...

[I expect a much better gain on a UDP load, where memory allocator
costs are a larger share of the total.]
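To make the idea concrete, here is a minimal userspace sketch of the
technique (not the actual SLUB patch; toy_alloc() and the object layout
are invented for illustration). Popping an object leaves the new head's
link possibly cold on this cpu, since it was last written by whichever
cpu freed it, so we prefetch it for the following allocation:

#include <stdlib.h>

struct freelist { struct freelist *next; };

static struct freelist *head;	/* stands in for the per-cpu freelist */

static void *toy_alloc(void)
{
	struct freelist *object = head;

	if (!object)
		return NULL;	/* real SLUB falls back to its slow path here */

	head = object->next;
	if (head)
		/* warm up the link the *next* allocation will follow */
		__builtin_prefetch(&head->next);
	return object;
}

int main(void)
{
	/* carve 32 fake objects out of one block and chain them up */
	enum { NOBJ = 32, SZ = 64 };
	char *slab = malloc(NOBJ * SZ);
	int i;

	if (!slab)
		return 1;
	for (i = 0; i < NOBJ; i++) {
		struct freelist *f = (struct freelist *)(slab + i * SZ);

		f->next = (i + 1 < NOBJ) ?
			(struct freelist *)(slab + (i + 1) * SZ) : NULL;
	}
	head = (struct freelist *)slab;

	while (toy_alloc())
		;		/* drain the list, prefetching as we go */
	free(slab);
	return 0;
}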
$ cat netperf.sh
# start 32 concurrent request/response tests; a negative -l value makes
# netperf run for that many transactions instead of seconds
for i in `seq 1 32`
do
	netperf -H 192.168.20.110 -v 0 -l -100000 -t TCP_RR &
done
wait
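The counters below come from wrapping the script with perf, presumably
invoked as something like:

perf stat ./netperf.sh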
If cpu0 handles network interrupts and the other cpus run the applications:
Before:

 Performance counter stats for './netperf.sh':

      38001,927957 task-clock                #    2,344 CPUs utilized
         3 306 138 context-switches          #    0,087 M/sec
                79 CPU-migrations            #    0,000 M/sec
             9 656 page-faults               #    0,000 M/sec
    83 564 329 446 cycles                    #    2,199 GHz
    61 350 744 867 stalled-cycles-frontend   #   73,42% frontend cycles idle
    34 907 541 687 stalled-cycles-backend    #   41,77% backend cycles idle
    44 739 971 752 instructions              #    0,54  insns per cycle
                                             #    1,37  stalled cycles per insn
     8 662 005 669 branches                  #  227,936 M/sec
       249 555 153 branch-misses             #    2,88% of all branches

      16,214220448 seconds time elapsed
After:

 Performance counter stats for './netperf.sh':

      37035,347847 task-clock                #    2,374 CPUs utilized
         3 314 540 context-switches          #    0,089 M/sec
               131 CPU-migrations            #    0,000 M/sec
             9 691 page-faults               #    0,000 M/sec
    81 783 678 294 cycles                    #    2,208 GHz
    59 595 242 695 stalled-cycles-frontend   #   72,87% frontend cycles idle
    34 367 813 304 stalled-cycles-backend    #   42,02% backend cycles idle
    44 698 853 546 instructions              #    0,55  insns per cycle
                                             #    1,33  stalled cycles per insn
     8 654 940 308 branches                  #  233,694 M/sec
       245 578 562 branch-misses             #    2,84% of all branches

      15,597940419 seconds time elapsed
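In short: roughly 2% fewer cycles (83,56e9 -> 81,78e9) and about 3.8%
less elapsed time (16,21 -> 15,60 seconds) for the same 32 x 100000
transactions.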