Message-Id: <F8B81F3F-4E21-4468-8963-2A762AF9A7CC@gmail.com>
Date:	Fri, 27 Feb 2015 23:33:37 +0900
From:	Jaedon Shin <jaedon.shin@...il.com>
To:	David Miller <davem@...emloft.net>
Cc:	Florian Fainelli <f.fainelli@...il.com>, netdev@...r.kernel.org,
	eric.dumazet@...il.com
Subject: Re: [PATCH] net: bcmgenet: fix throughtput regression with TSO autosizing

> On Feb 27, 2015, at 1:36 AM, David Miller <davem@...emloft.net> wrote:
> 
> From: Jaedon Shin <jaedon.shin@...il.com>
> Date: Thu, 26 Feb 2015 20:05:58 +0900
> 
>> This patch prevents the performance degradation of xmit after
>> 605ad7f ("tcp: refine TSO autosizing").
>> 
>> Signed-off-by: Jaedon Shin <jaedon.shin@...il.com>
> 
> I doubt this is the correct way to fix this.
> 
> Also, you need to describe in detail what the actual problem
> is, how you evaluated the cause, and what made you think that
> your chosen solution is the proper one.

For tx_ring[{0,1,2,3}], bcmgenet_tx_reclaim() is processed only under 18
descriptors, which is too late to reclaim the transmitted skbs. This
causes the xmit performance degradation after 605ad7f ("tcp: refine TSO
autosizing"):

# iperf -c 172.16.1.20 -i 1
------------------------------------------------------------
Client connecting to 172.16.1.20, TCP port 5001
TCP window size: 45.0 KByte (default)
------------------------------------------------------------
[  3] local 172.16.1.101 port 44020 connected with 172.16.1.20 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec   512 KBytes  4.19 Mbits/sec
[  3]  1.0- 2.0 sec   384 KBytes  3.15 Mbits/sec
[  3]  2.0- 3.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  3.0- 4.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  4.0- 5.0 sec   128 KBytes  1.05 Mbits/sec
[  3]  5.0- 6.0 sec  0.00 Bytes  0.00 bits/sec
[  3]  6.0- 7.0 sec  5.75 MBytes  48.2 Mbits/sec
[  3]  7.0- 8.0 sec  11.1 MBytes  93.3 Mbits/sec
[  3]  8.0- 9.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  9.0-10.0 sec  11.1 MBytes  93.3 Mbits/sec
[  3]  0.0-10.0 sec  40.5 MBytes  33.9 Mbits/sec
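
To illustrate why a late reclaim can stall the sender, here is a rough
user-space sketch, not driver code. It assumes that "only under 18
descriptors" means reclaim runs only once 18 or fewer descriptors are
free, and that the sender may keep only a small number of packets in
flight until the driver frees them. All names (toy_ring, toy_reclaim,
toy_run) and the one-descriptor-per-packet simplification are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of a TX ring; not taken from bcmgenet. */
struct toy_ring {
	int size;      /* total descriptors in the ring */
	int free_bds;  /* descriptors currently free */
	int done;      /* descriptors the "hardware" has finished sending */
};

/* Free all finished descriptors; returns how many packets were freed. */
static int toy_reclaim(struct toy_ring *r)
{
	int n = r->done;

	r->free_bds += n;
	r->done = 0;
	return n;
}

/*
 * Run `ticks` rounds. In each round the sender queues packets until it
 * reaches its in-flight budget (or the ring is full), the hardware
 * completes everything queued, and the driver reclaims by policy:
 *   lazy == true:  reclaim only when free_bds <= threshold
 *   lazy == false: reclaim after every round
 * Returns the total number of packets the sender could queue.
 */
static long toy_run(int ring_size, int threshold, int budget,
		    bool lazy, int ticks)
{
	struct toy_ring r = { ring_size, ring_size, 0 };
	int in_flight = 0;  /* packets charged to the sender, not yet freed */
	long sent = 0;

	assert(ring_size > threshold && budget > 0);

	for (int t = 0; t < ticks; t++) {
		/* Sender: limited by its budget and by ring space. */
		while (in_flight < budget && r.free_bds > 0) {
			r.free_bds--;
			in_flight++;
			sent++;
		}

		/* Hardware: every queued descriptor is now transmitted. */
		r.done = r.size - r.free_bds;

		/* Driver: the budget is returned only on reclaim. */
		if (!lazy || r.free_bds <= threshold)
			in_flight -= toy_reclaim(&r);
	}
	return sent;
}
```

In this model, toy_run(256, 18, 2, true, 100) queues only 2 packets,
because the ring never gets close to full and so nothing is ever freed,
while toy_run(256, 18, 2, false, 100) queues 200. With a large budget
(300) the lazy policy still fills the ring and reclaims, which would
match the problem showing up only once the sender keeps less data
queued.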

I considered two ways to avoid this:

1. Using skb_orphan(), which I found in a previous mailing list thread.
This reports the end of xmit immediately.

2. Using bcmgenet_tx_reclaim_all() instead of bcmgenet_tx_reclaim() in
bcmgenet_poll(). But this is not suitable, because bcmgenet_poll() is
connected with bcmgenet_isr0().

So I had to use the shortest way. The result is as follows:

# iperf -c 172.16.1.20 -i 1
------------------------------------------------------------
Client connecting to 172.16.1.20, TCP port 5001
TCP window size: 45.0 KByte (default)
------------------------------------------------------------
[  3] local 172.16.1.101 port 50763 connected with 172.16.1.20 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 1.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  1.0- 2.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  2.0- 3.0 sec  11.1 MBytes  93.3 Mbits/sec
[  3]  3.0- 4.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  4.0- 5.0 sec  11.1 MBytes  93.3 Mbits/sec
[  3]  5.0- 6.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  6.0- 7.0 sec  11.1 MBytes  93.3 Mbits/sec
[  3]  7.0- 8.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  8.0- 9.0 sec  11.2 MBytes  94.4 Mbits/sec
[  3]  9.0-10.0 sec  11.1 MBytes  93.3 Mbits/sec
[  3]  0.0-10.0 sec   112 MBytes  94.0 Mbits/sec

> Hmpff...
> 
> Can you elaborate on the regression ?
> 
> Is it because NIC delays TX completion irq or something ?
> 
> bcmgenet_poll() only drains TX packets on ring16 (DESC_INDEX)
> 
> Other tx completions seem to run from hard irq context (bcmgenet_isr1())
> 
> Your patch seems to imply a bug in hard irq signaling/handling.

Good point.

skb_orphan() is used in hard irq context. I was mistaken about the use
of hard irq for tx_ring[{0,1,2,3}].

The v2 patch uses tx_poll, like bcmsysport.
That also gives good results.
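
As a rough picture of the tx_poll idea, here is a user-space sketch of
the general pattern only: the interrupt handler does no reclaim work, it
masks the TX-done interrupt and defers to a poll routine, and the poll
routine reclaims and unmasks once nothing is left. This is not the v2
patch and not bcmsysport code; all names (toy_txq, toy_tx_irq,
toy_tx_poll, toy_hw_complete) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; none of these names are kernel or driver APIs. */
struct toy_txq {
	int completed;      /* packets finished by hardware, not yet freed */
	bool irq_masked;    /* TX-done interrupt is currently masked */
	bool poll_pending;  /* a deferred poll has been scheduled */
	long reclaimed;     /* total packets freed so far */
};

/* Hard-irq side: no reclaim here, only mask the source and defer. */
static void toy_tx_irq(struct toy_txq *q)
{
	if (q->irq_masked)
		return;
	q->irq_masked = true;
	q->poll_pending = true;
}

/*
 * Deferred side: free everything completed so far and return the work
 * done. When there was nothing to do, stop polling and unmask the
 * interrupt so the next completion schedules a new poll.
 */
static int toy_tx_poll(struct toy_txq *q)
{
	int work = q->completed;

	q->reclaimed += work;
	q->completed = 0;
	if (work == 0) {
		q->poll_pending = false;
		q->irq_masked = false;
	}
	return work;
}

/* Hardware completes n packets and raises the TX-done interrupt. */
static void toy_hw_complete(struct toy_txq *q, int n)
{
	assert(n >= 0);
	q->completed += n;
	toy_tx_irq(q);
}
```

In this sketch, completions that arrive while the interrupt is masked
are still counted and are picked up by the next toy_tx_poll() call, so
reclaim does not wait for the ring to fill up.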
