Message-ID: <9329021e-2d77-7e90-b0e2-8b391508f6cb@mellanox.com>
Date: Mon, 28 May 2018 12:12:42 +0300
From: Tariq Toukan <tariqt@...lanox.com>
To: David Miller <davem@...emloft.net>, edumazet@...gle.com
Cc: netdev@...r.kernel.org, fw@...len.de, herbert@...dor.apana.org.au,
tgraf@...g.ch, brouer@...hat.com, alex.aring@...il.com,
stefan@....samsung.com, ktkhai@...tuozzo.com,
eric.dumazet@...il.com, Moshe Shemesh <moshe@...lanox.com>,
Eran Ben Elisha <eranbe@...lanox.com>
Subject: Re: [PATCH v4 net-next 00/19] inet: frags: bring rhashtables to IP
defrag
On 01/04/2018 6:25 AM, David Miller wrote:
> From: Eric Dumazet <edumazet@...gle.com>
> Date: Sat, 31 Mar 2018 12:58:41 -0700
>
>> IP defrag processing is one of the remaining problematic layers in Linux.
>>
>> It uses static hash tables of 1024 buckets, and up to 128 items per bucket.
>>
>> A work queue is supposed to garbage-collect items when the host is under
>> memory pressure, doing a hash rebuild that changes the seed used in hash
>> computations.
>>
>> This work queue blocks softirqs for up to 25 ms when doing a hash rebuild,
>> occurring every 5 seconds if the host is under fire.
>>
>> Then there is the problem of sharing this hash table for all netns.
>>
>> It is time to switch to rhashtables, and allocate one of them per netns
>> to speed up netns dismantle, since this is a critical metric these days.
>>
>> Lookup is now using RCU, and 64bit hosts can now provision whatever amount
>> of memory needed to handle the expected workloads.
> ...
>
> Series applied, thanks Eric.
>
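For context, the static-table limits quoted in the cover letter put a hard
cap on the number of in-flight fragment queues; a quick back-of-the-envelope
check (figures taken directly from the text above):

```python
# Pre-patchset static hash table limits, as quoted in the cover letter.
buckets = 1024          # static hash table of 1024 buckets
items_per_bucket = 128  # up to 128 items per bucket
max_frag_queues = buckets * items_per_bucket
print(max_frag_queues)  # 131072 fragment queues, shared by all netns
```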
Hi Eric,
Recently my colleague (Moshe Shemesh) hit a failure in our upstream
regression testing that is related to this patchset. We did not see the
failure before it was merged.
We checked again on net-next (from May 24th), and it still reproduces.
The test case runs netperf with a single IPv6 UDP stream (64K message size).
After the change we see huge packet loss:
145,134 messages failed out of 145,419 (only 285 fully received).
[root@...-l-vrt-67100-104 ~]# netperf -H fe80::e61d:2dff:feca:c7c3%ens9,inet6 -t udp_stream --
MIGRATED UDP STREAM TEST from ::0 (::) port 0 AF_INET6 to fe80::e61d:2dff:feca:c7c3%ens9 () port 0 AF_INET6
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   65507   10.00      145419      0    7620.35
212992           10.00         285             14.93
Checking the nstat counters, we see that Ip6ReasmFails is very high:
#kernel
...
Ip6InReceives                   6665965            0.0
Ip6InDelivers                   300                0.0
Ip6OutRequests                  9                  0.0
Ip6ReasmReqds                   6665950            0.0
Ip6ReasmOKs                     285                0.0
Ip6ReasmFails                   6650890            0.0
Ip6InOctets                     9813929354         0.0
Ip6OutOctets                    2608               0.0
Ip6InNoECTPkts                  6665965            0.0
...
Udp6InDatagrams                 286                0.0
...
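Putting the failing run's counters together (a quick sanity computation on
the numbers above, nothing more):

```python
# nstat counters from the failing (post-patchset) run, copied from above.
reasm_reqds = 6_665_950   # Ip6ReasmReqds: fragments handed to reassembly
reasm_oks   = 285         # Ip6ReasmOKs:   datagrams fully reassembled
reasm_fails = 6_650_890   # Ip6ReasmFails

fail_ratio = reasm_fails / reasm_reqds
print(f"reassembly failure ratio: {fail_ratio:.2%}")  # ~99.77%

# Rough fragments-per-message estimate: a 65507-byte datagram fragments
# into roughly 46 packets at a 1500-byte MTU, which matches the counters.
messages_sent = 145_419   # from the netperf output above
print(f"~{reasm_reqds / messages_sent:.0f} fragments per 64K message")
```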
The same test on a kernel without the patchset has a low failure rate:
only 810 messages failed out of 114,112 (113,302 fully received).
[root@...-l-vrt-67100-104 ~]# netperf -H fe80::e61d:2dff:feca:c7c3%ens9,inet6 -t udp_stream --
MIGRATED UDP STREAM TEST from ::0 (::) port 0 AF_INET6 to fe80::e61d:2dff:feca:c7c3%ens9 () port 0 AF_INET6
Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

212992   65507   10.00      114112      0    5979.69
212992           10.00      113302           5937.24
nstat counters to compare:
#kernel
...
Ip6InReceives                   5249166            0.0
Ip6InDelivers                   114126             0.0
Ip6OutRequests                  8                  0.0
Ip6ReasmReqds                   5249152            0.0
Ip6ReasmOKs                     114112             0.0
Ip6InOctets                     7728009224         0.0
Ip6OutOctets                    2544               0.0
Ip6InNoECTPkts                  5249166            0.0
...
Udp6InDatagrams                 113303             0.0
Udp6InErrors                    810                0.0
Udp6RcvbufErrors                810                0.0
...
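Comparing the two runs side by side (delivery ratios computed from the
netperf figures above):

```python
# (messages fully received, messages sent) per kernel, from the runs above.
runs = {
    "with patchset":    (285, 145_419),
    "without patchset": (113_302, 114_112),
}

for name, (ok, sent) in runs.items():
    print(f"{name}: {ok / sent:.2%} delivered")
# with patchset:    0.20% delivered
# without patchset: 99.29% delivered
```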
We have not yet bisected within the patchset.
Regards,
Tariq and Moshe