Date: Fri, 14 Jun 2024 12:24:09 -0400
From: Neal Cardwell <ncardwell@...gle.com>
To: Neil Hellfeldt <hellfeldt@...eem.com>
Cc: netdev@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>, 
	Yuchung Cheng <ycheng@...gle.com>
Subject: Re: Throughtput regression due to [v2,net,2/2] tcp: fix delayed ACKs
 for MSS boundary condition

On Fri, Jun 14, 2024 at 9:50 AM Neil Hellfeldt <hellfeldt@...eem.com> wrote:
>
> Hi,
>
> So I believe I found a regression due to:
> patch: [v2,net,2/2] tcp: fix delayed ACKs for MSS boundary condition
> commit: 4720852ed9afb1c5ab84e96135cb5b73d5afde6f
>
> I recently upgraded our production machines from Ubuntu 16.04 all the
> way up to 24.04.
>
> In the process I noticed that iperf3 was no longer able to get the
> throughput that it was able to on 16.04.
> I found that Ubuntu 22.04 is when it broke. Then I found that Ubuntu's
> kernel version 5.15.0-92 worked
> fine and version 5.15.0-93 did not. After that I narrowed it down to the
> patch:
>
> patch: [v2,net,2/2] tcp: fix delayed ACKs for MSS boundary condition
> commit: 4720852ed9afb1c5ab84e96135cb5b73d5afde6f
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 06fe1cf645d5a..8afb0950a6979 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -253,6 +253,19 @@  static void tcp_measure_rcv_mss(struct sock *sk, const struct sk_buff *skb)
>                 if (unlikely(len > icsk->icsk_ack.rcv_mss +
>                                    MAX_TCP_OPTION_SPACE))
>                         tcp_gro_dev_warn(sk, skb, len);
> +               /* If the skb has a len of exactly 1*MSS and has the PSH bit
> +                * set then it is likely the end of an application write. So
> +                * more data may not be arriving soon, and yet the data sender
> +                * may be waiting for an ACK if cwnd-bound or using TX zero
> +                * copy. So we set ICSK_ACK_PUSHED here so that
> +                * tcp_cleanup_rbuf() will send an ACK immediately if the app
> +                * reads all of the data and is not ping-pong. If len > MSS
> +                * then this logic does not matter (and does not hurt) because
> +                * tcp_cleanup_rbuf() will always ACK immediately if the app
> +                * reads data and there is more than an MSS of unACKed data.
> +                */
> +               if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_PSH)
> +                       icsk->icsk_ack.pending |= ICSK_ACK_PUSHED;
>         } else {
>                 /* Otherwise, we make more careful check taking into account,
>                  * that SACKs block is variable.
>
>
> After I removed the patch I was able to get the expected speeds. I then reverted the patch on the most current
> Ubuntu kernel version, 6.8.0-35, and was able to get the expected speeds again.
>
> The unit under test is an embedded device with lwIP and a built-in iperf3 server. It has a 100 Mbit network port.
> It is also a low data rate wireless radio. The expected rate for Ethernet is ~52000 Kbps avg; with the patch applied
> it was getting ~38000 Kbps. The expected throughput for wireless is ~328 Kbps; with the patch applied we are getting ~292 Kbps.
> That is a ~27% throughput regression for Ethernet and ~11% for wireless.
>
> Command used: iperf3 -c 172.18.8.134 -P 1 -i 1 -f m -t 10 -4 -w 64K -R
> The iperf version is 3.0.11 from Ubuntu. The newer iperf3 3.16 shows the correct speeds but shows a sawtooth plot for the wireless.
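
[Editor's note: the regression percentages quoted above can be checked with a few lines of arithmetic. This is only a sketch; the function name regression_pct is made up for illustration, and the inputs are the figures from the report.]

```python
def regression_pct(expected_kbps: float, observed_kbps: float) -> float:
    """Relative throughput drop, as a percentage of the expected rate."""
    return (expected_kbps - observed_kbps) / expected_kbps * 100.0

# Figures taken from the report above (Kbps).
ethernet = regression_pct(52000, 38000)
wireless = regression_pct(328, 292)

print(f"Ethernet: {ethernet:.1f}%  wireless: {wireless:.1f}%")
# prints "Ethernet: 26.9%  wireless: 11.0%"
```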

Thanks for the report!

AFAICT your email did not specify the direction of data transfer, but
I gather that you are talking about a throughput regression when the
embedded device is the TCP sender and the Ubuntu machine is the
receiver?

Can you please gather some traces on the receiver Ubuntu machine, as
root, and share the results?

Something like:

tcpdump -w /root/tcpdump.pcap -n -s 116 -c 1000000 -i $eth_device &
nstat -n; (while true; do date; nstat; sleep 0.5; done)  > /root/nstat.txt &
# run test ...
kill %1
kill %2

Ideally it would be great if you could provide those traces for (a)
the fast case, and then also (b) the slow case, so we can compare.
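
[Editor's note: once both nstat logs exist, one possible way to compare them is sketched below. It assumes each data line has the usual "name increment rate" shape and that every sample is an increment since the previous nstat run, so summing the samples gives a total per counter. The function names and the file names in the usage comment are hypothetical; the commands above only write /root/nstat.txt.]

```python
import re
from collections import Counter

# An nstat data line: counter name, integer increment, float rate.
# The interleaved `date` lines and "#kernel" headers do not match.
_LINE = re.compile(r"^(\S+)\s+(-?\d+)\s+(-?\d+(?:\.\d+)?)\s*$")

def sum_nstat(text: str) -> Counter:
    """Sum the per-sample increments of every counter in one log."""
    totals = Counter()
    for line in text.splitlines():
        m = _LINE.match(line)
        if m:
            totals[m.group(1)] += int(m.group(2))
    return totals

def compare(fast: Counter, slow: Counter):
    """Return (name, fast_total, slow_total) for counters that differ,
    largest absolute difference first."""
    names = sorted(set(fast) | set(slow),
                   key=lambda n: abs(slow[n] - fast[n]), reverse=True)
    return [(n, fast[n], slow[n]) for n in names if fast[n] != slow[n]]

# Hypothetical usage, assuming the two runs were saved under these names:
#   fast = sum_nstat(open("/root/nstat-fast.txt").read())
#   slow = sum_nstat(open("/root/nstat-slow.txt").read())
#   for name, f, s in compare(fast, slow):
#       print(f"{name:32s} fast={f:10d} slow={s:10d}")
```

Counters with large differences between the two runs (delayed ACK counters, for example) are only a starting point; the tcpdump traces are still needed to see what actually happened on the wire.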

Thanks!
neal
