netdev - Re: [RFC PATCH net-next 1/2] tcp: RTO Restart (RTOR)

Message-ID: <20151207164623.GA22976@mrl.redhat.com>
Date:	Mon, 7 Dec 2015 14:46:23 -0200
From:	Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
To:	Per Hurtig <per.hurtig@....se>
Cc:	davem@...emloft.net, edumazet@...gle.com, ncardwell@...gle.com,
	nanditad@...gle.com, tom@...bertland.com, ycheng@...gle.com,
	viro@...iv.linux.org.uk, fw@...len.de, daniel@...earbox.net,
	willemb@...gle.com, ilpo.jarvinen@...sinki.fi,
	pasi.sarolahti@....fi, stephen@...workplumber.org,
	netdev@...r.kernel.org, anna.brunstrom@....se, apetlund@...ula.no,
	michawe@....uio.no, mohammad.rajiullah@....se
Subject: Re: [RFC PATCH net-next 1/2] tcp: RTO Restart (RTOR)

On Mon, Dec 07, 2015 at 10:00:11AM +0100, Per Hurtig wrote:
> This patch implements the RTO restart modification (RTOR). When data is
> ACKed, and the RTO timer is restarted, the time elapsed since the last
> outstanding segment was transmitted is subtracted from the calculated RTO
> value. This way, the RTO timer will expire after exactly RTO seconds, and
> not RTO + RTT [+ delACK] seconds.
> 
> This patch also implements a new sysctl (tcp_timer_restart) that is used
> to control the timer restart behavior.
> 
> Signed-off-by: Per Hurtig <per.hurtig@....se>
> ---
>  Documentation/networking/ip-sysctl.txt | 12 ++++++++++++
>  include/net/tcp.h                      |  4 ++++
>  net/ipv4/sysctl_net_ipv4.c             | 10 ++++++++++
>  net/ipv4/tcp_input.c                   | 24 ++++++++++++++++++++++++
>  4 files changed, 50 insertions(+)
> 
> diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt
> index 2ea4c45..4094128 100644
> --- a/Documentation/networking/ip-sysctl.txt
> +++ b/Documentation/networking/ip-sysctl.txt

(snip)

> @@ -2997,6 +2998,18 @@ static void tcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
>  	tcp_sk(sk)->snd_cwnd_stamp = tcp_time_stamp;
>  }
>  
> +static u32 tcp_unsent_pkts(const struct sock *sk)
> +{
> +	struct sk_buff *skb = tcp_send_head(sk);
> +	u32 pkts = 0;
> +
> +	if (skb)
> +		tcp_for_write_queue_from(skb, sk)
> +			pkts += tcp_skb_pcount(skb);
> +
> +	return pkts;
> +}
> +
>  /* Restart timer after forward progress on connection.
>   * RFC2988 recommends to restart timer to now+rto.
>   */
> @@ -3027,6 +3040,17 @@ void tcp_rearm_rto(struct sock *sk)
>  			 */
>  			if (delta > 0)
>  				rto = delta;
> +		} else if (icsk->icsk_pending == ICSK_TIME_RETRANS &&
> +			   (sysctl_tcp_timer_restart == 1 ||
> +			    sysctl_tcp_timer_restart == 3) &&
> +			   (tp->packets_out + tcp_unsent_pkts(sk) <
> +			    TCP_RTORESTART_THRESH)) {

(snip)

By the time this gets hit, you could have a big write queue, and
tcp_unsent_pkts() walks all of it. What about wrapping at least this condition
tp->packets_out + tcp_unsent_pkts(sk) < TCP_RTORESTART_THRESH
in its own check function? Like:

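+/* True if packets in flight plus unsent packets stay below
+ * TCP_RTORESTART_THRESH. Stops walking the write queue early.
+ */
+static bool tcp_can_rtor(const struct sock *sk)
+{
+	const struct tcp_sock *tp = tcp_sk(sk);
+	struct sk_buff *skb = tcp_send_head(sk);
+	s32 target = TCP_RTORESTART_THRESH - tp->packets_out;
+
+	if (target <= 0)
+		return false;
+
+	if (skb) {
+		tcp_for_write_queue_from(skb, sk) {
+			target -= tcp_skb_pcount(skb);
+			if (target <= 0)
+				return false;
+		}
+	}
+
+	return true;
+}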

This way it will only traverse what is needed for the check itself.
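
To illustrate the difference, here is a minimal user-space sketch, not
kernel code. struct seg, count_all() and below_thresh() are hypothetical
stand-ins for sk_buff, tcp_unsent_pkts() and the helper above, and the
threshold is passed in as a parameter since its real value is in the
snipped part of the patch. Both functions count how many queue entries
they visit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an sk_buff sitting on the write queue. */
struct seg {
	unsigned int pcount;	/* packets carried by this segment */
	struct seg *next;	/* next segment, NULL at end of queue */
};

/* Full walk, like tcp_unsent_pkts(): always visits every segment. */
static unsigned int count_all(const struct seg *head, unsigned int *visited)
{
	unsigned int pkts = 0;

	assert(visited);
	for (const struct seg *s = head; s; s = s->next) {
		pkts += s->pcount;
		(*visited)++;
	}
	return pkts;
}

/* Bounded walk: returns false as soon as packets_out plus the packets
 * seen so far reach thresh, so at most thresh segments are visited.
 */
static bool below_thresh(const struct seg *head, unsigned int packets_out,
			 int thresh, unsigned int *visited)
{
	int target = thresh - (int)packets_out;

	assert(visited);
	if (target <= 0)
		return false;

	for (const struct seg *s = head; s; s = s->next) {
		(*visited)++;
		target -= (int)s->pcount;
		if (target <= 0)
			return false;
	}
	return true;
}
```

With a 100-segment queue of one packet each, one packet in flight and an
assumed threshold of 4, count_all() visits all 100 segments while
below_thresh() gives the same answer after visiting 3.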

  Marcelo
