Message-ID: <CALDO+SZcxks4xF-YZEJe3dL2sp9wR7kWYCnAnokhr-y3f9-AeQ@mail.gmail.com>
Date: Mon, 26 Mar 2018 14:58:02 -0700
From: William Tu <u9012063@...il.com>
To: Jesper Dangaard Brouer <brouer@...hat.com>
Cc: Björn Töpel <bjorn.topel@...il.com>,
magnus.karlsson@...el.com,
Alexander Duyck <alexander.h.duyck@...el.com>,
Alexander Duyck <alexander.duyck@...il.com>,
John Fastabend <john.fastabend@...il.com>,
Alexei Starovoitov <ast@...com>,
willemdebruijn.kernel@...il.com,
Daniel Borkmann <daniel@...earbox.net>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Björn Töpel <bjorn.topel@...el.com>,
michael.lundkvist@...csson.com, jesse.brandeburg@...el.com,
anjali.singhai@...el.com, jeffrey.b.shaw@...el.com,
ferruh.yigit@...el.com, qi.z.zhang@...el.com
Subject: Re: [RFC PATCH 00/24] Introducing AF_XDP support
Hi Jesper,
Thanks a lot for your prompt reply.
>> Hi,
>> I also did an evaluation of AF_XDP, however the performance isn't as
>> good as above.
>> I'd like to share the result and see if there are some tuning suggestions.
>>
>> System:
>> 16 core, Intel(R) Xeon(R) CPU E5-2440 v2 @ 1.90GHz
>> Intel 10G X540-AT2 ---> so I can only run XDP_SKB mode
>
> Hmmm, why is X540-AT2 not able to use XDP natively?
Because I'm only able to use the ixgbe driver for this NIC,
and the AF_XDP patch set only has i40e support?
>
>> AF_XDP performance:
>> Benchmark XDP_SKB
>> rxdrop 1.27 Mpps
>> txpush 0.99 Mpps
>> l2fwd 0.85 Mpps
>
> Definitely too low...
>
I did another run; rxdrop looks better:
Benchmark XDP_SKB
rxdrop 2.3 Mpps
txpush 1.05 Mpps
l2fwd 0.90 Mpps
> What is the performance if you drop packets via iptables?
>
> Command:
> $ iptables -t raw -I PREROUTING -p udp --dport 9 --j DROP
>
I did
# iptables -t raw -I PREROUTING -p udp -i enp10s0f0 -j DROP
# iptables -nvL -t raw; sleep 10; iptables -nvL -t raw
and I got 2.9 Mpps.
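In case it helps anyone reproduce this, the rate can be read straight
off the raw-table counters; a rough sketch, assuming the DROP rule added
above is the only DROP rule in PREROUTING:

a=$(iptables -t raw -nvxL PREROUTING | awk '/DROP/ {print $1; exit}')
sleep 10
b=$(iptables -t raw -nvxL PREROUTING | awk '/DROP/ {print $1; exit}')
# packets dropped over 10 seconds, converted to Mpps
awk -v a="$a" -v b="$b" 'BEGIN {printf "%.2f Mpps\n", (b - a) / 10 / 1e6}'

i.e. the packet-count delta between the two snapshots divided by the
10-second interval.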
>> NIC configuration:
>> the command
>> "ethtool -N p3p2 flow-type udp4 src-port 4242 dst-port 4242 action 16"
>> doesn't work on my ixgbe driver, so I use ntuple:
>>
>> ethtool -K enp10s0f0 ntuple on
>> ethtool -U enp10s0f0 flow-type udp4 src-ip 10.1.1.100 action 1
>> then
>> echo 1 > /proc/sys/net/core/bpf_jit_enable
>> ./xdpsock -i enp10s0f0 -r -S --queue=1
>>
>> I also take a look at perf result:
>> For rxdrop:
>> 86.56% xdpsock xdpsock [.] main
>> 9.22% xdpsock [kernel.vmlinux] [k] nmi
>> 4.23% xdpsock xdpsock [.] xq_enq
>
> It looks very strange that you see non-maskable interrupt's (NMI) being
> this high...
>
Yes, that's weird. Looking at the perf annotate output for nmi,
it shows 100% of the time spent on a nop instruction.
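For reference, the annotation can be pulled up with something like:

# perf annotate --stdio nmi

(not sure the nmi symbol resolves the same way on every setup).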
>
>> For l2fwd:
>> 20.81% xdpsock xdpsock [.] main
>> 10.64% xdpsock [kernel.vmlinux] [k] clflush_cache_range
>
> Oh, clflush_cache_range is being called!
I thought clflush_cache_range is high because we have many smp_rmb/smp_wmb
calls in the xdpsock queue/ring management userspace code.
(perf shows that 75% of this 10.64% is spent on the mfence instruction.)
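To be clear about which barriers I mean, here is a rough consumer-side
sketch (my own simplification, not the actual xdpsock sample code; the
struct and helper names are made up):

#include <stdint.h>

/* Illustration only: load the producer index before the descriptors it
 * publishes.  How cheap the barrier is depends on how the sample defines
 * it on a given architecture. */
#define u_smp_rmb() __asm__ __volatile__("" ::: "memory")

struct ring {
	volatile uint32_t *producer;	/* advanced by the kernel */
	uint32_t cached_cons;		/* local consumer index */
	uint32_t mask;			/* ring size - 1 */
	uint64_t *descs;		/* descriptor array */
};

/* Return 0 and fill *desc if an entry was available, -1 if the ring is empty. */
static int ring_deq(struct ring *q, uint64_t *desc)
{
	uint32_t prod = *q->producer;	/* read the producer index first... */

	u_smp_rmb();			/* ...then the entries it covers */
	if (prod == q->cached_cons)
		return -1;
	*desc = q->descs[q->cached_cons++ & q->mask];
	return 0;
}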
> Do your system use an IOMMU ?
>
Yes, with CONFIG_INTEL_IOMMU=y,
and I saw some related functions being called (e.g. intel_alloc_iova).
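To double-check at runtime that the IOMMU is actually enabled, and not
just built in, something like this should tell:

# cat /proc/cmdline            (look for intel_iommu=on / iommu=pt)
# dmesg | grep -i -e DMAR -e IOMMU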
>> 8.46% xdpsock [kernel.vmlinux] [k] xsk_sendmsg
>> 6.72% xdpsock [kernel.vmlinux] [k] skb_set_owner_w
>> 5.89% xdpsock [kernel.vmlinux] [k] __domain_mapping
>> 5.74% xdpsock [kernel.vmlinux] [k] alloc_skb_with_frags
>> 4.62% xdpsock [kernel.vmlinux] [k] netif_skb_features
>> 3.96% xdpsock [kernel.vmlinux] [k] ___slab_alloc
>> 3.18% xdpsock [kernel.vmlinux] [k] nmi
>
> Again high count for NMI ?!?
>
> Maybe you just forgot to tell perf that you want it to decode the
> bpf_prog correctly?
>
> https://prototype-kernel.readthedocs.io/en/latest/bpf/troubleshooting.html#perf-tool-symbols
>
> Enable via:
> $ sysctl net/core/bpf_jit_kallsyms=1
>
> And use perf report (while BPF is STILL LOADED):
>
> $ perf report --kallsyms=/proc/kallsyms
>
> E.g. for emailing this you can use this command:
>
> $ perf report --sort cpu,comm,dso,symbol --kallsyms=/proc/kallsyms --no-children --stdio -g none | head -n 40
>
Thanks, I followed the steps; here is the l2fwd result:
# Total Lost Samples: 119
#
# Samples: 2K of event 'cycles:ppp'
# Event count (approx.): 25675705627
#
# Overhead CPU Command Shared Object Symbol
# ........ ... ....... .................. ..................................
#
10.48% 013 xdpsock xdpsock [.] main
9.77% 013 xdpsock [kernel.vmlinux] [k] clflush_cache_range
8.45% 013 xdpsock [kernel.vmlinux] [k] nmi
8.07% 013 xdpsock [kernel.vmlinux] [k] xsk_sendmsg
7.81% 013 xdpsock [kernel.vmlinux] [k] __domain_mapping
4.95% 013 xdpsock [kernel.vmlinux] [k] ixgbe_xmit_frame_ring
4.66% 013 xdpsock [kernel.vmlinux] [k] skb_store_bits
4.39% 013 xdpsock [kernel.vmlinux] [k] syscall_return_via_sysret
3.93% 013 xdpsock [kernel.vmlinux] [k] pfn_to_dma_pte
2.62% 013 xdpsock [kernel.vmlinux] [k] __intel_map_single
2.53% 013 xdpsock [kernel.vmlinux] [k] __alloc_skb
2.36% 013 xdpsock [kernel.vmlinux] [k] iommu_no_mapping
2.21% 013 xdpsock [kernel.vmlinux] [k] alloc_skb_with_frags
2.07% 013 xdpsock [kernel.vmlinux] [k] skb_set_owner_w
1.98% 013 xdpsock [kernel.vmlinux] [k] __kmalloc_node_track_caller
1.94% 013 xdpsock [kernel.vmlinux] [k] ksize
1.84% 013 xdpsock [kernel.vmlinux] [k] validate_xmit_skb_list
1.62% 013 xdpsock [kernel.vmlinux] [k] kmem_cache_alloc_node
1.48% 013 xdpsock [kernel.vmlinux] [k] __kmalloc_reserve.isra.37
1.21% 013 xdpsock xdpsock [.] xq_enq
1.08% 013 xdpsock [kernel.vmlinux] [k] intel_alloc_iova
And l2fwd under "perf stat" looks OK to me: there are few context
switches, the CPU is fully utilized, and 1.17 insn per cycle seems OK.
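The counters below are per CPU; a command along these lines should
reproduce the same view (my exact invocation may have differed):

# perf stat -C 6 -- sleep 10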
 Performance counter stats for 'CPU(s) 6':

      10000.787420      cpu-clock (msec)          #    1.000 CPUs utilized
                24      context-switches          #    0.002 K/sec
                 0      cpu-migrations            #    0.000 K/sec
                 0      page-faults               #    0.000 K/sec
    22,361,333,647      cycles                    #    2.236 GHz
    13,458,442,838      stalled-cycles-frontend   #   60.19% frontend cycles idle
    26,251,003,067      instructions              #    1.17  insn per cycle
                                                  #    0.51  stalled cycles per insn
     4,938,921,868      branches                  #  493.853 M/sec
         7,591,739      branch-misses             #    0.15% of all branches

      10.000835769 seconds time elapsed
Will continue investigating...
Thanks
William