Message-ID: <20180327113750.33cb4d5b@redhat.com>
Date: Tue, 27 Mar 2018 11:37:50 +0200
From: Jesper Dangaard Brouer <brouer@...hat.com>
To: William Tu <u9012063@...il.com>
Cc: Björn Töpel <bjorn.topel@...il.com>,
magnus.karlsson@...el.com,
Alexander Duyck <alexander.h.duyck@...el.com>,
Alexander Duyck <alexander.duyck@...il.com>,
John Fastabend <john.fastabend@...il.com>,
Alexei Starovoitov <ast@...com>,
willemdebruijn.kernel@...il.com,
Daniel Borkmann <daniel@...earbox.net>,
Linux Kernel Network Developers <netdev@...r.kernel.org>,
Björn Töpel <bjorn.topel@...el.com>,
michael.lundkvist@...csson.com,
jesse.brandeburg@...el.com, anjali.singhai@...el.com,
jeffrey.b.shaw@...el.com, ferruh.yigit@...el.com,
qi.z.zhang@...el.com, brouer@...hat.com
Subject: Re: [RFC PATCH 00/24] Introducing AF_XDP support

On Mon, 26 Mar 2018 14:58:02 -0700
William Tu <u9012063@...il.com> wrote:
> > Again high count for NMI ?!?
> >
> > Maybe you just forgot to tell perf that you want it to decode the
> > bpf_prog correctly?
> >
> > https://prototype-kernel.readthedocs.io/en/latest/bpf/troubleshooting.html#perf-tool-symbols
> >
> > Enable via:
> > $ sysctl net/core/bpf_jit_kallsyms=1
> >
> > And use perf report (while BPF is STILL LOADED):
> >
> > $ perf report --kallsyms=/proc/kallsyms
> >
> > E.g. for emailing this you can use this command:
> >
> > $ perf report --sort cpu,comm,dso,symbol --kallsyms=/proc/kallsyms --no-children --stdio -g none | head -n 40
> >
>
> Thanks, I followed the steps; here is the result of l2fwd:
> # Total Lost Samples: 119
> #
> # Samples: 2K of event 'cycles:ppp'
> # Event count (approx.): 25675705627
> #
> # Overhead CPU Command Shared Object Symbol
> # ........ ... ....... .................. ..................................
> #
> 10.48% 013 xdpsock xdpsock [.] main
> 9.77% 013 xdpsock [kernel.vmlinux] [k] clflush_cache_range
> 8.45% 013 xdpsock [kernel.vmlinux] [k] nmi
> 8.07% 013 xdpsock [kernel.vmlinux] [k] xsk_sendmsg
> 7.81% 013 xdpsock [kernel.vmlinux] [k] __domain_mapping
> 4.95% 013 xdpsock [kernel.vmlinux] [k] ixgbe_xmit_frame_ring
> 4.66% 013 xdpsock [kernel.vmlinux] [k] skb_store_bits
> 4.39% 013 xdpsock [kernel.vmlinux] [k] syscall_return_via_sysret
> 3.93% 013 xdpsock [kernel.vmlinux] [k] pfn_to_dma_pte
> 2.62% 013 xdpsock [kernel.vmlinux] [k] __intel_map_single
> 2.53% 013 xdpsock [kernel.vmlinux] [k] __alloc_skb
> 2.36% 013 xdpsock [kernel.vmlinux] [k] iommu_no_mapping
> 2.21% 013 xdpsock [kernel.vmlinux] [k] alloc_skb_with_frags
> 2.07% 013 xdpsock [kernel.vmlinux] [k] skb_set_owner_w
> 1.98% 013 xdpsock [kernel.vmlinux] [k] __kmalloc_node_track_caller
> 1.94% 013 xdpsock [kernel.vmlinux] [k] ksize
> 1.84% 013 xdpsock [kernel.vmlinux] [k] validate_xmit_skb_list
> 1.62% 013 xdpsock [kernel.vmlinux] [k] kmem_cache_alloc_node
> 1.48% 013 xdpsock [kernel.vmlinux] [k] __kmalloc_reserve.isra.37
> 1.21% 013 xdpsock xdpsock [.] xq_enq
> 1.08% 013 xdpsock [kernel.vmlinux] [k] intel_alloc_iova
>
Since you did use net/core/bpf_jit_kallsyms=1 and the correct perf
command for decoding bpf_prog symbols, the 'nmi' entry at perf top #3
is likely a real NMI call... which looks wrong.
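
To check whether real NMIs are actually firing at that rate (a quick
sketch; the 10 second interval is just an example), you can watch the
per-CPU NMI counters in /proc/interrupts while the test runs, and rule
out the NMI watchdog as a source:

  $ grep NMI /proc/interrupts; sleep 10; grep NMI /proc/interrupts
  $ sysctl kernel.nmi_watchdog
  $ sysctl -w kernel.nmi_watchdog=0   # temporarily disable, re-test

Note that perf's own PMU sampling interrupts are also delivered as
NMIs on x86, so some increase in the counters is expected while
sampling is active.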
> And l2fwd under "perf stat" looks OK to me. There are few context
> switches, the CPU is fully utilized, and 1.17 insn per cycle seems OK.
>
> Performance counter stats for 'CPU(s) 6':
> 10000.787420 cpu-clock (msec) # 1.000 CPUs utilized
> 24 context-switches # 0.002 K/sec
> 0 cpu-migrations # 0.000 K/sec
> 0 page-faults # 0.000 K/sec
> 22,361,333,647 cycles # 2.236 GHz
> 13,458,442,838 stalled-cycles-frontend # 60.19% frontend cycles idle
> 26,251,003,067 instructions # 1.17 insn per cycle
> # 0.51 stalled cycles per insn
> 4,938,921,868 branches # 493.853 M/sec
> 7,591,739 branch-misses # 0.15% of all branches
> 10.000835769 seconds time elapsed
This perf stat also indicates something is wrong.

The 1.17 insn per cycle is NOT okay; it is too low compared to what I
usually see (e.g. 2.36 insn per cycle).

It clearly shows 'stalled-cycles-frontend' and '60.19% frontend cycles
idle'. This means your CPU has a bottleneck fetching instructions, as
explained by Andi Kleen here [1].

[1] https://github.com/andikleen/pmu-tools/wiki/toplev-manual
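
To narrow down where the frontend stalls come from, a first step could
be (a sketch; CPU 6 matches your perf stat run above, and the
iTLB/icache event names are generic perf aliases that are not
available on every CPU):

  $ perf stat -C 6 \
      -e cycles,instructions,stalled-cycles-frontend \
      -e iTLB-load-misses,L1-icache-load-misses \
      sleep 10

High iTLB or icache miss rates would point at instruction-fetch
pressure; the toplev tool from [1] can break the 'Frontend Bound'
category down further.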
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer