lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAD56B7dwKDKnrCjpGmrnxz2P0QpNWU3CGBvOtqg3RBx3ejPh9g@mail.gmail.com>
Date:   Tue, 12 Nov 2019 17:42:42 -0500
From:   Paul Thomas <pthomas8589@...il.com>
To:     Alexei Starovoitov <ast@...nel.org>,
        Daniel Borkmann <daniel@...earbox.net>,
        David Miller <davem@...emloft.net>,
        Jakub Kicinski <jakub.kicinski@...ronome.com>,
        Jesper Dangaard Brouer <hawk@...nel.org>,
        John Fastabend <john.fastabend@...il.com>,
        netdev@...r.kernel.org, xdp-newbies@...r.kernel.org,
        bpf@...r.kernel.org,
        linux-rt-users <linux-rt-users@...r.kernel.org>
Subject: xdpsock poll with 5.2.21rt kernel

Hello,

I'm doing some testing with AF_XDP, and I'm seeing some behavior I
don't quite understand. It seems I can get into a situation where
xdpsock (from samples/bpf/scpsocke_user.c) is using most of the cpu
even though I'm trying to use poll().

To start with I run xdpsock with --rxdrop and --poll. At first this
behaves nicely, the cpu usage is very low:
# ps -AL -o pid,lwp,cmd,comm,rtprio,cpuid,pcpu | grep [x]dpsock
 1932  1932 ./xdpsock -r -p -i eth1     xdpsock              -     3  0.0
 1932  1933 ./xdpsock -r -p -i eth1     xdpsock              -     2  0.0

And strace shows nice orderly ppoll timeouts every second.
# strace -p 1932
strace: Process 1932 attached
ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=0, tv_nsec=510616211}, NULL,
0) = 0 (Timeout)
ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=1, tv_nsec=0}, NULL, 0) = 0 (Timeout)
ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=1, tv_nsec=0}, NULL, 0) = 0 (Timeout)
...

Then I generate some traffic and ppoll() is not timing out anymore:
ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=1, tv_nsec=0}, NULL, 0) = 1
([{fd=3, revents=POLLIN}], left {tv_sec=0, tv_nsec=999996790})
ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=1, tv_nsec=0}, NULL, 0) = 1
([{fd=3, revents=POLLIN}], left {tv_sec=0, tv_nsec=999997260})
ppoll([{fd=3, events=POLLIN}], 1, {tv_sec=1, tv_nsec=0}, NULL, 0) = 1
([{fd=3, revents=POLLIN}], left {tv_sec=0, tv_nsec=999997100})

This is where it get's strange, if I stop the traffic, then strace no
longer generates any activity but the xdpsock cpu usage is way up:
# ps -AL -o pid,lwp,cmd,comm,rtprio,cpuid,pcpu | grep [x]dpsock
 1932  1932 ./xdpsock -r -p -i eth1     xdpsock              -     3 61.0
 1932  1933 ./xdpsock -r -p -i eth1     xdpsock              -     2  0.0

So is it getting stuck at while (ret != rcvd) in rx_drop()?

Is it a normal case to get past the poll() and then have
xsk_ring_cons__peek() not equal xsk_ring_prod__reserve()?

I see the added xsk_ring_prod__needs_wakeup() with the extra poll() in
the latest 5.4 kernels, but I don't think any of the needs_wakeup
stuff is in the 5.2 kernel. Is that needed for this case?

This is with a 5.2.21 preempt-rt kernel on arm64 using the macb driver
(so XDP_SKB and not XDP_DRV).

Any thoughts would be appreciated.

thanks,
Paul

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ