[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7ab35f56-cb8e-ddbc-ce8a-148a943dec92@gmail.com>
Date: Tue, 11 Dec 2018 19:01:35 +0200
From: Risto Pajula <or.pajula@...il.com>
To: Heiner Kallweit <hkallweit1@...il.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Alexey Kuznetsov <kuznet@....inr.ac.ru>,
Realtek linux nic maintainers <nic_swsd@...ltek.com>,
netdev@...r.kernel.org
Subject: Re: IP (rtl8169) forwarding bug (performance)
Hello.
A freshly built 4.20.0-rc6-next-20181210-lp150.12.25-default waited me
when I got back from work, but unfortunately it did not help at all, it
behaved exactly in same manner.
Reply from 192.168.0.1: bytes=32 time<1ms TTL=64
Reply from 192.168.0.1: bytes=32 time<1ms TTL=64
Reply from 192.168.0.1: bytes=32 time<1ms TTL=64
Reply from 192.168.0.1: bytes=32 time<1ms TTL=64
Reply from 192.168.0.1: bytes=32 time<1ms TTL=64
Reply from 192.168.0.1: bytes=32 time=73ms TTL=64
Reply from 192.168.0.1: bytes=32 time<1ms TTL=64
Reply from 192.168.0.1: bytes=32 time=83ms TTL=64
Reply from 192.168.0.1: bytes=32 time=307ms TTL=64
Reply from 192.168.0.1: bytes=32 time=115ms TTL=64
Reply from 192.168.0.1: bytes=32 time<1ms TTL=64
Reply from 192.168.0.1: bytes=32 time<1ms TTL=64
BR.
Risto
On 11.12.2018 0:20, Risto Pajula wrote:
> Hello.
>
> I have not yet tested with linux-next but I will, thanks for pointing
> that out.
>
> ...But I have studied the problem a bit more, indeed it seems that the
> rtl8169 transmission queue gets stuck.
>
> Below is some trace log. Starting from 802026 a burst of frames is
> forwarded from eth0 to eth1 and when we get to the rtl_tx only two
> frames have been transmitted.. (Should have had 1,25 ms to transmit
> them).. Also then the sequence seems to repeat for very long time that
> only two massages get transmitted.
>
> 800541: <idle>-0 [001] ..s. 7237.698143:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 34
> cur_tx: 4406946 frags: 0 dirty_tx: 4406946 tx_left: 0
> 800671: <idle>-0 [001] ..s. 7237.698189:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 35
> cur_tx: 4406947 frags: 0 dirty_tx: 4406946 tx_left: 1
> 800835: <idle>-0 [001] ..s. 7237.698235:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 36
> cur_tx: 4406948 frags: 0 dirty_tx: 4406946 tx_left: 2
> 800905: <idle>-0 [000] ..s. 7237.698255: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 37 cur_tx: 4406949 frags: 0
> dirty_tx: 4406946 tx_left: 3
> 800932: <idle>-0 [000] ..s. 7237.698263: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 37 cur_tx: 4406949 frags: 0
> dirty_tx: 4406947 tx_left: 2
> 800960: <idle>-0 [000] ..s. 7237.698270: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 37 cur_tx: 4406949 frags: 0
> dirty_tx: 4406948 tx_left: 1
> 800998: <idle>-0 [001] ..s. 7237.698282:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 37
> cur_tx: 4406949 frags: 0 dirty_tx: 4406949 tx_left: 0
> 801085: <idle>-0 [000] ..s. 7237.698312: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 38 cur_tx: 4406950 frags: 0
> dirty_tx: 4406949 tx_left: 1
> 801137: <idle>-0 [001] ..s. 7237.698329:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 38
> cur_tx: 4406950 frags: 0 dirty_tx: 4406950 tx_left: 0
> 801229: <idle>-0 [000] ..s. 7237.698363: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 39 cur_tx: 4406951 frags: 0
> dirty_tx: 4406950 tx_left: 1
> 801264: <idle>-0 [001] ..s. 7237.698375:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 39
> cur_tx: 4406951 frags: 0 dirty_tx: 4406951 tx_left: 0
> 801356: <idle>-0 [000] ..s. 7237.698409: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 40 cur_tx: 4406952 frags: 0
> dirty_tx: 4406951 tx_left: 1
> 801391: <idle>-0 [001] ..s. 7237.698421:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 40
> cur_tx: 4406952 frags: 0 dirty_tx: 4406952 tx_left: 0
> 801482: <idle>-0 [000] ..s. 7237.698455: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 41 cur_tx: 4406953 frags: 0
> dirty_tx: 4406952 tx_left: 1
> 801518: <idle>-0 [001] ..s. 7237.698468:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 41
> cur_tx: 4406953 frags: 0 dirty_tx: 4406953 tx_left: 0
> 801610: <idle>-0 [000] ..s. 7237.698502: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 42 cur_tx: 4406954 frags: 0
> dirty_tx: 4406953 tx_left: 1
> 801645: <idle>-0 [001] ..s. 7237.698514:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 42
> cur_tx: 4406954 frags: 0 dirty_tx: 4406954 tx_left: 0
> 801737: <idle>-0 [000] ..s. 7237.698548: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 43 cur_tx: 4406955 frags: 0
> dirty_tx: 4406954 tx_left: 1
> 801772: <idle>-0 [001] ..s. 7237.698560:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 43
> cur_tx: 4406955 frags: 0 dirty_tx: 4406955 tx_left: 0
> 801864: <idle>-0 [000] ..s. 7237.698594: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 44 cur_tx: 4406956 frags: 0
> dirty_tx: 4406955 tx_left: 1
> 801899: <idle>-0 [001] ..s. 7237.698607:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 44
> cur_tx: 4406956 frags: 0 dirty_tx: 4406956 tx_left: 0
> 801984: <idle>-0 [000] ..s. 7237.698640: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 45 cur_tx: 4406957 frags: 0
> dirty_tx: 4406956 tx_left: 1
> 802026: <idle>-0 [001] ..s. 7237.698655:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 45
> cur_tx: 4406957 frags: 0 dirty_tx: 4406957 tx_left: 0
> 802129: <idle>-0 [001] ..s. 7237.698701:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 46
> cur_tx: 4406958 frags: 0 dirty_tx: 4406957 tx_left: 1
> 802232: <idle>-0 [001] ..s. 7237.698747:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 47
> cur_tx: 4406959 frags: 0 dirty_tx: 4406957 tx_left: 2
> 802335: <idle>-0 [001] ..s. 7237.698793:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 48
> cur_tx: 4406960 frags: 0 dirty_tx: 4406957 tx_left: 3
> 802438: <idle>-0 [001] ..s. 7237.698840:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 49
> cur_tx: 4406961 frags: 0 dirty_tx: 4406957 tx_left: 4
> 802541: <idle>-0 [001] ..s. 7237.698885:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 50
> cur_tx: 4406962 frags: 0 dirty_tx: 4406957 tx_left: 5
> 802644: <idle>-0 [001] ..s. 7237.698932:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 51
> cur_tx: 4406963 frags: 0 dirty_tx: 4406957 tx_left: 6
> 802747: <idle>-0 [001] ..s. 7237.698978:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 52
> cur_tx: 4406964 frags: 0 dirty_tx: 4406957 tx_left: 7
> 802851: <idle>-0 [001] ..s. 7237.699025:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 53
> cur_tx: 4406965 frags: 0 dirty_tx: 4406957 tx_left: 8
> 805094: <idle>-0 [000] ..s. 7237.699871: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 54 cur_tx: 4406966 frags: 0
> dirty_tx: 4406957 tx_left: 9
> 805126: <idle>-0 [000] ..s. 7237.699878: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 54 cur_tx: 4406966 frags: 0
> dirty_tx: 4406958 tx_left: 8
> 805197: <idle>-0 [001] ..s. 7237.699905:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 54
> cur_tx: 4406966 frags: 0 dirty_tx: 4406959 tx_left: 7
> 805234: <idle>-0 [001] ..s. 7237.699925:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 55
> cur_tx: 4406967 frags: 0 dirty_tx: 4406959 tx_left: 8
> 811446: <idle>-0 [000] ..s. 7237.702344: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 56 cur_tx: 4406968 frags: 0
> dirty_tx: 4406959 tx_left: 9
> 811479: <idle>-0 [000] ..s. 7237.702354: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 56 cur_tx: 4406968 frags: 0
> dirty_tx: 4406960 tx_left: 8
> 811641: <idle>-0 [000] ..s. 7237.702395:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 56
> cur_tx: 4406968 frags: 0 dirty_tx: 4406961 tx_left: 7
> 811733: <idle>-0 [000] ..s. 7237.702416:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 57
> cur_tx: 4406969 frags: 0 dirty_tx: 4406961 tx_left: 8
> 812050: hping3-2871 [000] ..s. 7237.704770: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 58 cur_tx: 4406970 frags: 0
> dirty_tx: 4406961 tx_left: 9
> 812061: hping3-2871 [000] ..s. 7237.704777: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 58 cur_tx: 4406970 frags: 0
> dirty_tx: 4406962 tx_left: 8
> 812124: hping3-2871 [000] ..s. 7237.704811:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 58
> cur_tx: 4406970 frags: 0 dirty_tx: 4406963 tx_left: 7
> 812176: hping3-2871 [000] ..s. 7237.704831:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 59
> cur_tx: 4406971 frags: 0 dirty_tx: 4406963 tx_left: 8
> 812300: <idle>-0 [000] ..s. 7237.707247: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 60 cur_tx: 4406972 frags: 0
> dirty_tx: 4406963 tx_left: 9
> 812312: <idle>-0 [000] ..s. 7237.707257: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 60 cur_tx: 4406972 frags: 0
> dirty_tx: 4406964 tx_left: 8
> 812389: <idle>-0 [000] ..s. 7237.707297:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 60
> cur_tx: 4406972 frags: 0 dirty_tx: 4406965 tx_left: 7
> 812426: <idle>-0 [000] ..s. 7237.707318:
> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 1506 entry: 61
> cur_tx: 4406973 frags: 0 dirty_tx: 4406965 tx_left: 8
> 812827: <idle>-0 [000] ..s. 7237.709690: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 62 cur_tx: 4406974 frags: 0
> dirty_tx: 4406965 tx_left: 9
> 812838: <idle>-0 [000] ..s. 7237.709700: rtl8169_poll:
> ORP33 eth1 rtl_tx len: 1506 entry: 62 cur_tx: 4406974 frags: 0
> dirty_tx: 4406966 tx_left: 8
> .....
>
>
> BR.
> Risto
>
> On 10.12.2018 23:26, Heiner Kallweit wrote:
>> Did you test also with the latest linux-next kernel? Some recent
>> changes like 2e6eedb4813e
>> "r8169: make use of xmit_more and __netdev_sent_queue" may have a
>> positive impact.
>>
>> On 10.12.2018 00:28, Risto Pajula wrote:
>>> Hello.
>>>
>>> Old subject: "Re: IP fragmentation performance and don't fragment
>>> bug when forwarding
>>>
>>>
>>> I have now been tracing the kernel and finding the bug seems
>>> difficult. I think the bug is combination of several things, likely
>>> cause is that it only occurs with rtl8169 and how it is using the
>>> netdevapi/DQL api.
>>>
>>> From my investigations seems that following happens:
>>>
>>> *Burst of frames is received from internet. (eth0)
>>> *These are accepted and forwarded to the (eth1)
>>> *DQL hits the limit, this causes the scheduling for the device to be
>>> stopped (__QUEUE_STATE_STACK_XOFF)
>>> *This and combination of some timing causes the performance
>>> degradation, messages are only scheduled for transmission in soft
>>> Irq context.
>>>
>>> I still do now know, why the DQL hits the limits.. I think it should
>>> not, maybe something undesirable first happens with rtl8169.
>>>
>>> I inserted following trace printk functions to the code..
>>>
>>> rtl8169_start_xmit:
>>> trace_printk("ORP33 %s rtl8169_start_xmit len: %d entry: %u cur_tx:
>>> %u frags: %d dirty_tx: %u tx_left: %u\n" , tp->dev->name, skb->len,
>>> entry, tp->cur_tx, frags, tp->dirty_tx, (tp->cur_tx - tp->dirty_tx) );
>>>
>>>
>>> rtl_tx:
>>> trace_printk("ORP33 %s rtl_tx len: %d entry: %u cur_tx: %u frags: %d
>>> dirty_tx: %u tx_left: %u\n" , tp->dev->name, tx_skb->skb->len,
>>> tp->cur_tx % NUM_TX_DESC, tp->cur_tx, 0, dirty_tx, tx_left );
>>>
>>>
>>> Here is some grepped output only filtering the ping packet in two
>>> different situations:
>>>
>>> //trace when downloading ubuntu (~20MB/s), only 3 secs fit to
>>> buffer. ping is all the time normal....
>>> //orig tracing.out3754
>>> 117493: <idle>-0 [000] ..s. 5235.407116: rtl8169_poll:
>>> ORP33 eth1 rtl_tx len: 54 entry: 57 cur_tx: 3747641 frags: 0
>>> dirty_tx: 3747640 tx_left: 1
>>> 118704: <idle>-0 [000] ..s. 5235.416057: rtl8169_poll:
>>> ORP33 eth1 rtl_tx len: 54 entry: 20 cur_tx: 3747668 frags: 0
>>> dirty_tx: 3747667 tx_left: 1
>>> 346319: hping3-2871 [000] .... 5236.002456:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 8
>>> cur_tx: 3754312 frags: 0 dirty_tx: 3754312 tx_left: 0
>>> 346337: hping3-2871 [000] ..s. 5236.002475: rtl8169_poll:
>>> ORP33 eth1 rtl_tx len: 54 entry: 9 cur_tx: 3754313 frags: 0
>>> dirty_tx: 3754312 tx_left: 1
>>> 1277155: hping3-2871 [001] .... 5237.002627:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 38
>>> cur_tx: 3766502 frags: 0 dirty_tx: 3766502 tx_left: 0
>>> 1277173: <idle>-0 [000] ..s. 5237.002646:
>>> rtl8169_poll: ORP33 eth1 rtl_tx len: 54 entry: 39 cur_tx: 3766503
>>> frags: 0 dirty_tx: 3766502 tx_left: 1
>>>
>>>
>>> //imcp messages from route to intenal network when TCP stream is on.
>>> tracing.out3236/eth1_ring.txt is the original
>>> //no idea why the first frame is tripled...
>>> 200792: hping3-2871 [001] .... 3239.593384:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 34
>>> cur_tx: 2132770 frags: 0 dirty_tx: 2132770 tx_left: 0
>>> 201578: hping3-2871 [001] .... 3240.593625:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 40
>>> cur_tx: 2132776 frags: 0 dirty_tx: 2132776 tx_left: 0
>>> 341061: <idle>-0 [000] ..s. 3241.738011:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 52
>>> cur_tx: 2133684 frags: 0 dirty_tx: 2133680 tx_left: 4
>>> 342112: <idle>-0 [000] ..s. 3241.745297: rtl8169_poll:
>>> ORP33 eth1 rtl_tx len: 54 entry: 59 cur_tx: 2133691 frags: 0
>>> dirty_tx: 2133684 tx_left: 7
>>> 500126: <idle>-0 [000] ..s. 3242.739500:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 49
>>> cur_tx: 2134513 frags: 0 dirty_tx: 2134505 tx_left: 8
>>> 501739: <idle>-0 [000] ..s. 3242.751654: rtl8169_poll:
>>> ORP33 eth1 rtl_tx len: 54 entry: 58 cur_tx: 2134522 frags: 0
>>> dirty_tx: 2134513 tx_left: 9
>>> 646220: <idle>-0 [000] ..s. 3243.745761:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 41
>>> cur_tx: 2135273 frags: 0 dirty_tx: 2135267 tx_left: 6
>>> 647633: <idle>-0 [000] ..s. 3243.755486: rtl8169_poll:
>>> ORP33 eth1 rtl_tx len: 54 entry: 49 cur_tx: 2135281 frags: 0
>>> dirty_tx: 2135273 tx_left: 8
>>> 802878: <idle>-0 [000] ..s. 3244.739947:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 23
>>> cur_tx: 2136087 frags: 0 dirty_tx: 2136081 tx_left: 6
>>> 804298: <idle>-0 [000] ..s. 3244.749677: rtl8169_poll:
>>> ORP33 eth1 rtl_tx len: 54 entry: 31 cur_tx: 2136095 frags: 0
>>> dirty_tx: 2136087 tx_left: 8
>>> 961190: <idle>-0 [000] ..s. 3245.746217:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 16
>>> cur_tx: 2136912 frags: 0 dirty_tx: 2136906 tx_left: 6
>>> 962610: <idle>-0 [000] ..s. 3245.755946: rtl8169_poll:
>>> ORP33 eth1 rtl_tx len: 54 entry: 24 cur_tx: 2136920 frags: 0
>>> dirty_tx: 2136912 tx_left: 8
>>> 1118044: <idle>-0 [000] ..s. 3246.740336:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 62
>>> cur_tx: 2137726 frags: 0 dirty_tx: 2137720 tx_left: 6
>>> 1119443: <idle>-0 [000] ..s. 3246.750047:
>>> rtl8169_poll: ORP33 eth1 rtl_tx len: 54 entry: 5 cur_tx: 2137733
>>> frags: 0 dirty_tx: 2137726 tx_left: 7
>>> 1264047: <idle>-0 [000] ..s. 3247.709202:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 29
>>> cur_tx: 2138525 frags: 0 dirty_tx: 2138518 tx_left: 7
>>> 1264740: <idle>-0 [000] ..s. 3247.718918:
>>> rtl8169_poll: ORP33 eth1 rtl_tx len: 54 entry: 33 cur_tx: 2138529
>>> frags: 0 dirty_tx: 2138525 tx_left: 4
>>> 1419958: <idle>-0 [000] ..s. 3248.736436:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 48
>>> cur_tx: 2139312 frags: 0 dirty_tx: 2139305 tx_left: 7
>>> 1421551: <idle>-0 [000] ..s. 3248.746189:
>>> rtl8169_poll: ORP33 eth1 rtl_tx len: 54 entry: 56 cur_tx: 2139320
>>> frags: 0 dirty_tx: 2139312 tx_left: 8
>>> 1578746: <idle>-0 [000] ..s. 3249.742702:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 46
>>> cur_tx: 2140142 frags: 0 dirty_tx: 2140135 tx_left: 7
>>> 1580194: <idle>-0 [000] ..s. 3249.752457:
>>> rtl8169_poll: ORP33 eth1 rtl_tx len: 54 entry: 54 cur_tx: 2140150
>>> frags: 0 dirty_tx: 2140142 tx_left: 8
>>> 1729597: <idle>-0 [000] ..s. 3250.746839:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 36
>>> cur_tx: 2140964 frags: 0 dirty_tx: 2140957 tx_left: 7
>>> 1731015: <idle>-0 [000] ..s. 3250.756594:
>>> rtl8169_poll: ORP33 eth1 rtl_tx len: 54 entry: 43 cur_tx: 2140971
>>> frags: 0 dirty_tx: 2140964 tx_left: 7
>>> 1881028: <idle>-0 [000] ..s. 3251.740991:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 19
>>> cur_tx: 2141779 frags: 0 dirty_tx: 2141772 tx_left: 7
>>> 1881381: <idle>-0 [000] ..s. 3251.750718:
>>> rtl8169_poll: ORP33 eth1 rtl_tx len: 54 entry: 26 cur_tx: 2141786
>>> frags: 0 dirty_tx: 2141779 tx_left: 7
>>> //something happens, the output queue empties.. now it is fast, and
>>> rtl8169_start_xmit is not called from soft_irq
>>> 1896178: hping3-2871 [001] .... 3252.595994:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 23
>>> cur_tx: 2141911 frags: 0 dirty_tx: 2141911 tx_left: 0
>>> 1896196: <idle>-0 [000] ..s. 3252.596055:
>>> rtl8169_poll: ORP33 eth1 rtl_tx len: 54 entry: 24 cur_tx: 2141912
>>> frags: 0 dirty_tx: 2141911 tx_left: 1
>>> 1923136: hping3-2871 [001] .... 3253.596227:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 50
>>> cur_tx: 2142066 frags: 0 dirty_tx: 2142066 tx_left: 0
>>> 1923154: <idle>-0 [000] ..s. 3253.596289:
>>> rtl8169_poll: ORP33 eth1 rtl_tx len: 54 entry: 51 cur_tx: 2142067
>>> frags: 0 dirty_tx: 2142066 tx_left: 1
>>> 1951633: hping3-2871 [001] .... 3254.596459:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 5
>>> cur_tx: 2142213 frags: 0 dirty_tx: 2142213 tx_left: 0
>>> 1951651: <idle>-0 [000] ..s. 3254.596521:
>>> rtl8169_poll: ORP33 eth1 rtl_tx len: 54 entry: 6 cur_tx: 2142214
>>> frags: 0 dirty_tx: 2142213 tx_left: 1
>>> 1954472: hping3-2871 [001] .... 3255.596700:
>>> rtl8169_start_xmit: ORP33 eth1 rtl8169_start_xmit len: 54 entry: 15
>>> cur_tx: 2142223 frags: 0 dirty_tx: 2142223 tx_left: 0
>>> 1954490: <idle>-0 [000] ..s. 3255.596763:
>>> rtl8169_poll: ORP33 eth1 rtl_tx len: 54 entry: 16 cur_tx: 2142224
>>> frags: 0 dirty_tx: 2142223 tx_left: 1
>>> //until it soon, again is slow..
>>>
>>>
>>> BR.
>>> Risto
>>>
>>>
>>>
>>> On 7.12.2018 16:46, Risto Pajula wrote:
>>>> Hello.
>>>>
>>>> I have been to track the poor forwarding latency to the TCP Window
>>>> scale options. The Netgem device uses rather large windows scale
>>>> options (x256) and I have been able to reproduce the routers poor
>>>> forwarding latency also with linux box running in the internal
>>>> network and changing the net.ipv4.tcp_rmem to a large value and
>>>> thus changing the TCP window scaling options to larger ones. I
>>>> still do not have clue why this causes the forwarfing in the linux
>>>> kernel to block? Maybe something in the connection tracking....?
>>>>
>>>>
>>>> With the ICMP timestamp messages I have been able to also pinpoint
>>>> that the latency is caused in the eth1 sending side (the following
>>>> hping3 example is run in the router toward the internal network...
>>>>
>>>>
>>>> xxx:/usr/src/linux-4.20-rc2 # hping3 192.168.0.112 --icmp --icmp-ts -V
>>>> using eth1, addr: 192.168.0.1, MTU: 1500
>>>> HPING 192.168.0.112 (eth1 192.168.0.112): icmp mode set, 28 headers
>>>> + 0 data bytes
>>>> len=46 ip=192.168.0.112 ttl=64 id=49464 tos=0 iplen=40
>>>> icmp_seq=0 rtt=7.9 ms
>>>> ICMP timestamp: Originate=52294891 Receive=52294895 Transmit=52294895
>>>> ICMP timestamp RTT tsrtt=7
>>>>
>>>> len=46 ip=192.168.0.112 ttl=64 id=49795 tos=0 iplen=40
>>>> icmp_seq=1 rtt=235.9 ms
>>>> ICMP timestamp: Originate=52295891 Receive=52296128 Transmit=52296128
>>>> ICMP timestamp RTT tsrtt=235
>>>>
>>>> len=46 ip=192.168.0.112 ttl=64 id=49941 tos=0 iplen=40
>>>> icmp_seq=2 rtt=3.8 ms
>>>> ICMP timestamp: Originate=52296891 Receive=52296895 Transmit=52296895
>>>> ICMP timestamp RTT tsrtt=3
>>>>
>>>> len=46 ip=192.168.0.112 ttl=64 id=50685 tos=0 iplen=40
>>>> icmp_seq=3 rtt=47.8 ms
>>>> ICMP timestamp: Originate=52297891 Receive=52297940 Transmit=52297940
>>>> ICMP timestamp RTT tsrtt=47
>>>>
>>>> len=46 ip=192.168.0.112 ttl=64 id=51266 tos=0 iplen=40
>>>> icmp_seq=4 rtt=7.7 ms
>>>> ICMP timestamp: Originate=52298891 Receive=52298895 Transmit=52298895
>>>> ICMP timestamp RTT tsrtt=7
>>>>
>>>> len=46 ip=192.168.0.112 ttl=64 id=52245 tos=0 iplen=40
>>>> icmp_seq=5 rtt=3.7 ms
>>>> ICMP timestamp: Originate=52299891 Receive=52299895 Transmit=52299895
>>>> ICMP timestamp RTT tsrtt=3
>>>>
>>>> ^C
>>>> --- 192.168.0.112 hping statistic ---
>>>> 6 packets tramitted, 6 packets received, 0% packet loss
>>>> round-trip min/avg/max = 3.7/51.1/235.9 ms
>>>>
>>>>
>>>>
>>>> BR.
>>>> Risto
>>>>
>>>> On 2.12.2018 23:32, Risto Pajula wrote:
>>>>> Hello.
>>>>>
>>>>> You can most likely ignore the "DF Bit, mtu bug when forwarding"
>>>>> case. There isn't actually big IP packets on the wire, instead
>>>>> there is burst of packets on the wire, which are combined by the
>>>>> GRO... And thus dropping them should not happen. Sorry about the
>>>>> invalid bug report.
>>>>>
>>>>> However the poor latency from intenal network to the internet
>>>>> still remain, both GRO enabled and disabled. I will try to study
>>>>> further...
>>>>>
>>>>>
>>>>> BR.
>>>>> Risto
>>>>>
>>>>>
>>>>> On 2.12.2018 14:01, Risto Pajula wrote:
>>>>>> Hello.
>>>>>>
>>>>>> I have encountered a weird performance problem in Linux IP
>>>>>> fragmentation when using video streaming services behind the NAT.
>>>>>> Also I have studied a possible bug in the DF bit (don't fragment)
>>>>>> handling when forwarding the IP packets.
>>>>>>
>>>>>> First the system setup description:
>>>>>>
>>>>>> [host1]-int lan-(eth1)[linux router](eth0)-extlan-[fibre
>>>>>> router]-internet
>>>>>>
>>>>>> where:
>>>>>> host1: is a Netgem N7800 "cable box" for online video streaming
>>>>>> services provided by local telco (Can access Netflix, HBO nordic,
>>>>>> "live TV", etc.)
>>>>>> linux router: Linux computer with Dualcore Intel Celeron G1840,
>>>>>> running currently Linux kernel 4.20.0-rc2, and openSUSE Leap 15.0
>>>>>> eth1: Linux Routers internal (NAT) interface, 192.168.0.1/24
>>>>>> network, mtu set to 1500, RTL8169sb/8110sb
>>>>>> eth0: Linux Routers internet facing interface, public ip address,
>>>>>> mtu set to 1500, RTL8168evl/8111evl
>>>>>> fibre router: Alcatel Lucent fibre router (I-241G-Q), directly
>>>>>> connected to the eth0 of the Linux router.
>>>>>>
>>>>>> And now when using the Netgem N7800 with online video services
>>>>>> (Netflix, HBO nordic, etc) the Linux router will receive very BIG
>>>>>> IP packets in the eth0 upto ~20kB, this seems to lead to the
>>>>>> following problems in the Linux IP stack.
>>>>>>
>>>>>> IP fragmentation performance:
>>>>>> When the Linux router receives these large IP packets in the eth0
>>>>>> everything works, but it seems that them cause very large
>>>>>> performance degradation from internal network to the internet
>>>>>> regarding the latency when the IP fragmentation is performed. The
>>>>>> ping latency from internal network to the internel network
>>>>>> increases from stable 15ms-20ms up to 700-800ms AND also the ping
>>>>>> from the internal network to the linux router eth1 (192.168.0.).
>>>>>> However up link works perfectly, the ping is still stable when
>>>>>> streaming the online services (From linux router to the
>>>>>> internet). It seems that the IP fragmentation is somehow blocking
>>>>>> the eth1 reception or transmission for very long time (which it
>>>>>> shouldn't). I'm able to test and debug the issue further, but
>>>>>> advice regarding where to look would be appreciated.
>>>>>>
>>>>>>
>>>>>> DF Bit, mtu bug when forwarding:
>>>>>> I have started to study the above mentioned problem and have
>>>>>> found a possible bug in the DF bit and mtu handling in IP
>>>>>> forwarding. The BIG packets received from streaming services all
>>>>>> have the "DF bit" set and the question is that should we be
>>>>>> forwarding them at all as that would result them being
>>>>>> fragmented? Apparently we currently are... I have traced this
>>>>>> down to the ip_forward.c function ip_exceeds_mtu(), and the
>>>>>> following patch seems to fix that.
>>>>>>
>>>>>> --- net/ipv4/ip_forward.c.orig 2018-12-02 11:09:32.764320780 +0200
>>>>>> +++ net/ipv4/ip_forward.c 2018-12-02 12:53:25.031232347 +0200
>>>>>> @@ -49,7 +49,7 @@ static bool ip_exceeds_mtu(const struct
>>>>>> return false;
>>>>>>
>>>>>> /* original fragment exceeds mtu and DF is set */
>>>>>> - if (unlikely(IPCB(skb)->frag_max_size > mtu))
>>>>>> + if (unlikely(skb->len > mtu))
>>>>>> return true;
>>>>>>
>>>>>> if (skb->ignore_df)
>>>>>>
>>>>>>
>>>>>> This seems to work (in some ways) - after the change IP packets
>>>>>> that are too large to the internal network get dropped and we are
>>>>>> sending "ICMP Destination unreachable, The datagram is too big"
>>>>>> messages to the originator (as we should?). However it seems that
>>>>>> not all services really like this... Netflix behaves as expected
>>>>>> and ping is stable from internal network to the internet, but for
>>>>>> example HBO nordic will not work anymore (too little buffering?
>>>>>> Retransimissions not working?). So it seems the original issue
>>>>>> should be also fixed (And the fragmention should be allowed?).
>>>>>>
>>>>>>
>>>>>>
>>>>>> Any advice would be appreciated. Thanks!
>>>>>>
>>>>>> PS. Watching TV was not this intensive 20 years ago :)
>>>>>>
Powered by blists - more mailing lists