linux-kernel - Re: [PATCH] mm: page_alloc: High-order per-cpu page allocator v7


Message-ID: <1481141424.4930.71.camel@edumazet-glaptop3.roam.corp.google.com>
Date:   Wed, 07 Dec 2016 12:10:24 -0800
From:   Eric Dumazet <eric.dumazet@...il.com>
To:     Mel Gorman <mgorman@...hsingularity.net>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Christoph Lameter <cl@...ux.com>,
        Michal Hocko <mhocko@...e.com>,
        Vlastimil Babka <vbabka@...e.cz>,
        Johannes Weiner <hannes@...xchg.org>,
        Jesper Dangaard Brouer <brouer@...hat.com>,
        Joonsoo Kim <iamjoonsoo.kim@....com>,
        Linux-MM <linux-mm@...ck.org>,
        Linux-Kernel <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: page_alloc: High-order per-cpu page allocator v7

On Wed, 2016-12-07 at 19:48 +0000, Mel Gorman wrote:
>  
> 
> Interesting because it didn't match what I previously measured but then
> again, when I established that netperf on localhost was slab intensive,
> it was also an older kernel. Can you tell me if SLAB or SLUB was enabled
> in your test kernel?
> 
> Either that or the baseline I used has since been changed from what you
> are testing and we're not hitting the same paths.


lpaa6:~# uname -a
Linux lpaa6 4.9.0-smp-DEV #429 SMP @1481125332 x86_64 GNU/Linux

lpaa6:~# perf record -g ./netperf -t UDP_STREAM -l 3 -- -m 16384
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to localhost () port 0 AF_INET
Socket  Message  Elapsed      Messages                
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   16384   3.00       654644      0    28601.04
212992           3.00       654592           28598.77

[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 1.888 MB perf.data (~82481 samples) ]


perf report --stdio
...
     1.92%  netperf  [kernel.kallsyms]  [k] cache_alloc_refill
            |
            --- cache_alloc_refill
               |          
               |--82.22%-- kmem_cache_alloc_node_trace
               |          __kmalloc_node_track_caller
               |          __alloc_skb
               |          alloc_skb_with_frags
               |          sock_alloc_send_pskb
               |          sock_alloc_send_skb
               |          __ip_append_data.isra.50
               |          ip_make_skb
               |          udp_sendmsg
               |          inet_sendmsg
               |          sock_sendmsg
               |          SYSC_sendto
               |          sys_sendto
               |          entry_SYSCALL_64_fastpath
               |          __sendto_nocancel
               |          |          
               |           --100.00%-- 0x0
               |          
           

Oh wait, sock_alloc_send_skb() requests all of the bytes in skb->head
(it passes data_len == 0, so nothing goes into page frags):

struct sk_buff *sock_alloc_send_skb(struct sock *sk, unsigned long size,
                                    int noblock, int *errcode)
{
        return sock_alloc_send_pskb(sk, size, 0, noblock, errcode, 0);
}


Maybe one day we will avoid doing order-4 (or even order-5 in extreme
cases !) allocations for loopback as we did for af_unix :P

I mean, maybe some applications are sending 64KB UDP messages over
loopback right now...
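
To put rough numbers on the order-4 / order-5 remark, here is a minimal
userspace sketch of the page-order arithmetic. It is not kernel code: the
4 KiB page size and the 512-byte per-skb overhead (headers plus
skb_shared_info) are assumptions chosen for illustration, and
order_for_size() and report_order() are hypothetical helpers that only mimic
the "smallest power-of-two number of pages that fits" rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: 4 KiB pages, as on x86_64. */
#define ASSUMED_PAGE_SIZE 4096UL

/* Hypothetical per-skb overhead (headers plus skb_shared_info).
 * This is an illustrative figure, not a measured one. */
#define ASSUMED_OVERHEAD 512UL

/* Smallest order such that (ASSUMED_PAGE_SIZE << order) >= size. */
static unsigned int order_for_size(size_t size)
{
	unsigned int order = 0;
	size_t span = ASSUMED_PAGE_SIZE;

	assert(size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}

/* Print the order needed when the whole payload lives in one linear buffer. */
void report_order(size_t payload)
{
	size_t total = payload + ASSUMED_OVERHEAD;

	printf("payload %zu + overhead %lu = %zu bytes -> order-%u\n",
	       payload, ASSUMED_OVERHEAD, total, order_for_size(total));
}
```

Calling report_order(16384) and report_order(65507) from a main() gives
order-3 and order-5 respectively under these assumptions: any overhead at all
pushes a power-of-two sized payload into the next order, which is why a fully
linear skb gets expensive for large loopback sends.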