Date: Wed, 12 Jun 2024 09:44:13 -0700
From: YiFei Zhu <zhuyifei@...gle.com>
To: Maciej Fijalkowski <maciej.fijalkowski@...el.com>
Cc: Magnus Karlsson <magnus.karlsson@...il.com>, netdev@...r.kernel.org, bpf@...r.kernel.org, 
	Björn Töpel <bjorn@...nel.org>, 
	Magnus Karlsson <magnus.karlsson@...el.com>, Jonathan Lemon <jonathan.lemon@...il.com>, 
	Alexei Starovoitov <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	"David S . Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, 
	Jesper Dangaard Brouer <hawk@...nel.org>, John Fastabend <john.fastabend@...il.com>, 
	Andrii Nakryiko <andrii@...nel.org>, Stanislav Fomichev <sdf@...gle.com>, 
	Willem de Bruijn <willemb@...gle.com>
Subject: Re: [RFC PATCH net-next 0/3] selftests: Add AF_XDP functionality test

On Wed, Jun 12, 2024 at 5:50 AM Maciej Fijalkowski
<maciej.fijalkowski@...el.com> wrote:
>
> On Wed, Jun 12, 2024 at 01:47:06PM +0200, Magnus Karlsson wrote:
> > On Tue, 11 Jun 2024 at 22:43, YiFei Zhu <zhuyifei@...gle.com> wrote:
> > >
> > > > We have observed that hardware NIC drivers may have faulty AF_XDP
> > > > implementations, and there seems to be a lack of tests for the various
> > > > modes in which AF_XDP could run. This series adds a test to verify that
> > > > NIC drivers implement many AF_XDP features by performing a send / receive
> > > > of a single UDP packet.
> > >
> > > I put the C code of the test under selftests/bpf because I'm not really
> > > sure how I'd build the BPF-related code without the selftests/bpf
> > > build infrastructure.
> >
> > Happy to see that you are contributing a number of new tests. Would it
> > be possible for you to integrate this into the xskxceiver framework?
> > You can find that in selftests/bpf too. By default, it will run its
> > tests using veth, but if you provide an interface name after the -i
> > option, it will run the tests over a real interface. I put the NIC in
> > loopback mode to use this feature, but feel free to add a new mode if
> > necessary. A lot of the setup and data plane code that you add already
> > exists in xskxceiver, so I would prefer if you could reuse it. Your
> > tests are new though and they would be valuable to have.
>
> +1
>
> I just don't believe that you guys were not aware that xskxceiver exists.
> Please provide us a proper explanation/justification why this was not
> fulfilling your needs and you decided to go with another test suite.

To answer this question, I can't speak for others, but I personally
was not fully aware.

Over a year ago, when we were testing AF_XDP latency on internal NIC
drivers, we extended our internal latency test tool to support AF_XDP.
That was when we observed that the NICs we were testing had faulty
implementations: panics, packet corruption, random drops. We decided
to simplify the latency suite to add a simple pass/fail test to our
testing infrastructure, and we named it xsk_hw. The test was
specifically designed to test hardware NICs (rather than veth), and
there was a bunch of code around the test to reserve and set up
machines, and to obtain information such as the IP addresses and the
host and next-hop MAC addresses. At the time, the code was deemed too
dependent on our internal multi-machine testing infrastructure to
upstream, but it has been running as part of our test suite since.

This brings us to the present. I was recently informed that upstream
now has drv-net, and now that upstream also has multi-machine testing,
it's time to upstream xsk_hw. Hence this patch series, which I made
after adapting the code to use drv-net and network_helpers.

As for xskxceiver, I personally discarded the idea after reading its
initial block comment, which says it spawns two threads in a veth pair
to test AF_XDP. In my mind that read as "okay, this doesn't test
hardware NICs, and extending that test to hardware is probably a major
rewrite that is probably not worth it", so I did not look too deeply
into its code. I was unaware that it can test a real interface, and
that's partially my fault.

I'll take a look at xskxceiver and see how feasible it is to integrate
this into xskxceiver.

> >
> > You could make the default packet that is sent in xskxceiver be the
> > UDP packet that you want and then add all the other logic that you
> > have to a number of new tests that you introduce.
> >
> > > Tested on Google Cloud, with GVE:
> > >
> > >   $ sudo NETIF=ens4 REMOTE_TYPE=ssh \
> > >     REMOTE_ARGS="root@...138.15.235" \
> > >     LOCAL_V4="10.138.15.234" \
> > >     REMOTE_V4="10.138.15.235" \
> > >     LOCAL_NEXTHOP_MAC="42:01:0a:8a:00:01" \
> > >     REMOTE_NEXTHOP_MAC="42:01:0a:8a:00:01" \
> > >     python3 xsk_hw.py
> > >
> > >   KTAP version 1
> > >   1..22
> > >   ok 1 xsk_hw.ipv4_basic
> > >   ok 2 xsk_hw.ipv4_tx_skb_copy
> > >   ok 3 xsk_hw.ipv4_tx_skb_copy_force_attach
> > >   ok 4 xsk_hw.ipv4_rx_skb_copy
> > >   ok 5 xsk_hw.ipv4_tx_drv_copy
> > >   ok 6 xsk_hw.ipv4_tx_drv_copy_force_attach
> > >   ok 7 xsk_hw.ipv4_rx_drv_copy
> > >   [...]
> > >   # Exception| STDERR: b'/tmp/zzfhcqkg/pbgodkgjxsk_hw: recv_pfpacket: Timeout\n'
> > >   not ok 8 xsk_hw.ipv4_tx_drv_zerocopy
> > >   ok 9 xsk_hw.ipv4_tx_drv_zerocopy_force_attach
> > >   ok 10 xsk_hw.ipv4_rx_drv_zerocopy
> > >   [...]
> > >   # Exception| STDERR: b'/tmp/zzfhcqkg/pbgodkgjxsk_hw: connect sync client: max_retries\n'
> > >   [...]
> > >   # Exception| STDERR: b'/linux/tools/testing/selftests/bpf/xsk_hw: open_xsk: Device or resource busy\n'
> > >   not ok 11 xsk_hw.ipv4_rx_drv_zerocopy_fill_after_bind
> > >   ok 12 xsk_hw.ipv6_basic # SKIP Test requires IPv6 connectivity
> > >   [...]
> > >   ok 22 xsk_hw.ipv6_rx_drv_zerocopy_fill_after_bind # SKIP Test requires IPv6 connectivity
> > >   # Totals: pass:9 fail:2 xfail:0 xpass:0 skip:11 error:0
> > >
> > > YiFei Zhu (3):
> > >   selftests/bpf: Move rxq_num helper from xdp_hw_metadata to
> > >     network_helpers
> > >   selftests/bpf: Add xsk_hw AF_XDP functionality test
> > >   selftests: drv-net: Add xsk_hw AF_XDP functionality test
> > >
> > >  tools/testing/selftests/bpf/.gitignore        |   1 +
> > >  tools/testing/selftests/bpf/Makefile          |   7 +-
> > >  tools/testing/selftests/bpf/network_helpers.c |  27 +
> > >  tools/testing/selftests/bpf/network_helpers.h |  16 +
> > >  tools/testing/selftests/bpf/progs/xsk_hw.c    |  72 ++
> > >  tools/testing/selftests/bpf/xdp_hw_metadata.c |  27 +-
> > >  tools/testing/selftests/bpf/xsk_hw.c          | 844 ++++++++++++++++++
> > >  .../testing/selftests/drivers/net/hw/Makefile |   1 +
> > >  .../selftests/drivers/net/hw/xsk_hw.py        | 133 +++
> > >  9 files changed, 1102 insertions(+), 26 deletions(-)
> > >  create mode 100644 tools/testing/selftests/bpf/progs/xsk_hw.c
> > >  create mode 100644 tools/testing/selftests/bpf/xsk_hw.c
> > >  create mode 100755 tools/testing/selftests/drivers/net/hw/xsk_hw.py
> > >
> > > --
> > > 2.45.2.505.gda0bf45e8d-goog
> > >
> > >
> >
