[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <9811659B-6D5A-4C4F-9CF8-735E9CA6DE4E@appneta.com>
Date: Tue, 28 May 2019 09:57:35 -0700
From: Fred Klassen <fklassen@...neta.com>
To: Willem de Bruijn <willemdebruijn.kernel@...il.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>,
Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>,
Shuah Khan <shuah@...nel.org>,
Network Development <netdev@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
linux-kselftest@...r.kernel.org,
Willem de Bruijn <willemb@...gle.com>
Subject: Re: [PATCH net 4/4] net/udpgso_bench_tx: audit error queue
> On May 28, 2019, at 8:08 AM, Willem de Bruijn <willemdebruijn.kernel@...il.com> wrote:
>
I will push up latest patches soon.
I did some testing and discovered that only TCP audit tests failed. They
failed much less often when enabling poll. Once in about 20 runs
still failed. Therefore I commented out the TCP audit tests.
As for the other tests, this is what I got with poll() disabled…
udp gso zerocopy timestamp audit
udp rx: 1611 MB/s 1148129 calls/s
udp tx: 1659 MB/s 28146 calls/s 28146 msg/s
udp rx: 1686 MB/s 1201494 calls/s
udp tx: 1685 MB/s 28579 calls/s 28579 msg/s
udp rx: 1685 MB/s 1200402 calls/s
udp tx: 1683 MB/s 28552 calls/s 28552 msg/s
Summary over 3.000 seconds...
sum udp tx: 1716 MB/s 85277 calls (28425/s) 85277 msgs (28425/s)
Tx Timestamps: 85277 received 0 errors
Zerocopy acks: 85277 received 0 errors
Here you see that with poll() enabled, it is a bit slower, so I don’t have it
enabled in udpgso_bench.sh …
udp gso zerocopy timestamp audit
udp rx: 1591 MB/s 1133945 calls/s
udp tx: 1613 MB/s 27358 calls/s 27358 msg/s
udp rx: 1644 MB/s 1171674 calls/s
udp tx: 1643 MB/s 27869 calls/s 27869 msg/s
udp rx: 1643 MB/s 1170666 calls/s
udp tx: 1641 MB/s 27845 calls/s 27845 msg/s
Summary over 3.000 seconds...
sum udp tx: 1671 MB/s 83072 calls (27690/s) 83072 msgs (27690/s)
Tx Timestamps: 83072 received 0 errors
Zerocopy acks: 83072 received 0 errors
You may be interested that I reduced test lengths from 4 to 3 seconds,
but I am still getting 3 reports per test. I picked up the extra report by
changing 'if (tnow > treport)’ to 'if (tnow >= treport)’
> The only issue specific to GSO is that xmit_more can forego this
> doorbell until the last segment. We want to complicate this logic with
> a special case based on tx_flags. A process that cares should either
> not use GSO, or the timestamp should be associated with the last
> segment as I've been arguing so far.
This is the area I was thinking of looking into. I’m not sure it will work
or that it will be too messy. It may be worth a little bit of digging to
see if there is anything there. That will be down the road a bu
>>
>> I’ll get back to you when I have tested this more thoroughly. Early results
>> suggest that adding the -P poll() option has fixed it without any appreciable
>> performance hit. I’ll share raw results with you, and we can make a final
>> decision together.
>
> In the main loop? It still is peculiar that notifications appear to go
> missing unless the process blocks waiting for them. Nothing in
> sock_zerocopy_callback or the queueing onto the error queue should
> cause drops, as far as I know.
>
Now that I know the issue is only in TCP, I can speculate that all bytes are
being reported, but done with fewer messages. It may warrant some
investigation in case there is some kind of bug.
> Indeed. Ideally even run all tests, but return error if any failed,
> like this recent patch
>
> selftests/bpf: fail test_tunnel.sh if subtests fail
> https://patchwork.ozlabs.org/patch/1105221/
>
> but that may be a lot of code churn and better left to a separate patch.
I like it. I have it coded up, and it seems to work well. I’ll make a
separate commit in the patch set so we can yank it out if you feel
it is too much
Powered by blists - more mailing lists