lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAF=yD-JtbsK5s1Y9Ze79aXLuOT5s-3=ZsVB3kdDbuqyZX3HYSg@mail.gmail.com>
Date:   Fri, 24 Feb 2017 19:25:31 -0500
From:   Willem de Bruijn <willemdebruijn.kernel@...il.com>
To:     Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc:     Network Development <netdev@...r.kernel.org>,
        Willem de Bruijn <willemb@...gle.com>
Subject: Re: [PATCH RFC v2 00/12] socket sendmsg MSG_ZEROCOPY

On Fri, Feb 24, 2017 at 6:03 PM, Alexei Starovoitov
<alexei.starovoitov@...il.com> wrote:
> On Wed, Feb 22, 2017 at 11:38:49AM -0500, Willem de Bruijn wrote:
>>
>> * Limitations / Known Issues
>>
>> - PF_INET6 is not yet supported.
>
> we struggled so far to make it work in our setups which are ipv6 only.
> Looking at patches it seems the code should just work.
> What particularly is missing ?
>
> Great stuff. Looking forward to net-next reopening :)

Thanks for taking the feature for a spin!

The udp and raw paths require separate ipv6 patches. TCP should indeed
just work. I just ran a slightly modified snd_zerocopy_lo with good
results as well as a hacked netperf to another host.

I should have had ipv6 from the start, of course. Will add it before
resubmitting when net-next opens. For now, quick hack to
snd_zerocopy_lo.c:

diff --git a/tools/testing/selftests/net/snd_zerocopy_lo.c
b/tools/testing/selftests/net/snd_zerocopy_lo.c
index 309b016a4fd5..38a165e2af64 100644
--- a/tools/testing/selftests/net/snd_zerocopy_lo.c
+++ b/tools/testing/selftests/net/snd_zerocopy_lo.c
@@ -453,7 +453,7 @@ static int do_setup_rx(int domain, int type, int protocol)

 static void do_setup_and_run(int domain, int type, int protocol)
 {
-       struct sockaddr_in addr;
+       struct sockaddr_in6 addr;
        socklen_t alen;
        int fdr, fdt, ret;

@@ -468,8 +468,8 @@ static void do_setup_and_run(int domain, int type,
int protocol)

        if (domain != PF_PACKET) {
                memset(&addr, 0, sizeof(addr));
-               addr.sin_family = AF_INET;
-               addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
+               addr.sin6_family = AF_INET6;
+               addr.sin6_addr = in6addr_loopback;
                alen = sizeof(addr);

                if (bind(fdr, (void *) &addr, sizeof(addr)))
@@ -589,7 +589,7 @@ int main(int argc, char **argv)
        if (cfg_test_raw_hdrincl)
                do_setup_and_run(PF_INET, SOCK_RAW, IPPROTO_RAW);
        if (cfg_test_tcp)
-               do_setup_and_run(PF_INET, SOCK_STREAM, 0);
+               do_setup_and_run(PF_INET6, SOCK_STREAM, 0);


Loopback zerocopy is disabled in RFCv2, so to use snd_zerocopy_lo to verify the
feature requires this hack in skb_orphan_frags_rx:

 static inline int skb_orphan_frags_rx(struct sk_buff *skb, gfp_t gfp_mask)
 {
-       if (likely(!skb_zcopy(skb)))
-               return 0;
-       return skb_copy_ubufs(skb, gfp_mask);
+       return skb_orphan_frags(skb, gfp_mask);
 }

With this change, I see

$ ./snd_zerocopy_lo_ipv6 -t
test socket(10, 1, 0)
cpu: 23
rx=106364 (6637 MB) tx=106364 txc=0
rx=209736 (13088 MB) tx=209736 txc=0
rx=314524 (19627 MB) tx=314524 txc=0
rx=419424 (26174 MB) tx=419424 txc=0
OK. All tests passed

$ ./snd_zerocopy_lo_ipv6 -t -z
test socket(10, 1, 0)
cpu: 23
rx=239792 (14964 MB) tx=239792 txc=239786
rx=477376 (29790 MB) tx=477376 txc=477370
rx=715016 (44620 MB) tx=715016 txc=715010
rx=952820 (59460 MB) tx=952820 txc=952814
OK. All tests passed

In comparison, the same without the change

$ ./snd_zerocopy_lo_ipv6 -t
test socket(10, 1, 0)
cpu: 23
rx=109908 (6858 MB) tx=109908 txc=0
rx=217100 (13548 MB) tx=217100 txc=0
rx=326584 (20380 MB) tx=326584 txc=0
rx=429568 (26807 MB) tx=429568 txc=0
OK. All tests passed

$ ./snd_zerocopy_lo_ipv6 -t  -z
test socket(10, 1, 0)
cpu: 23
rx=87636 (5468 MB) tx=87636 txc=87630
rx=174328 (10878 MB) tx=174328 txc=174322
rx=260360 (16247 MB) tx=260360 txc=260354
rx=346512 (21623 MB) tx=346512 txc=346506

Here the sk_buff hits the deep copy in skb_copy_ubufs called from
__netif_receive_skb_core, which actually degrades performance versus
copying as part of the sendmsg() syscall.

The netperf change is to add MSG_ZEROCOPY to send() in send_tcp_stream
and also adding a recvmsg(send_socket, &msg, MSG_ERRQUEUE) to the same
function, preferably called only once every N iterations. This does
not take any additional explicit references on the send_ring element
while data is in flight, so is really a hack, but ring contents should
be static throughout the test. I did not modify the omni tests, so
this requires building with --no-omni.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ