lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Date: Sat, 12 Dec 2020 10:34:14 -0800 From: Yuchung Cheng <ycheng@...gle.com> To: Alexander Duyck <alexander.duyck@...il.com> Cc: Eric Dumazet <edumazet@...gle.com>, David Miller <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, Alexey Kuznetsov <kuznet@....inr.ac.ru>, Hideaki YOSHIFUJI <yoshfuji@...ux-ipv6.org>, netdev <netdev@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>, Martin Lau <kafai@...com>, Kernel Team <kernel-team@...com> Subject: Re: [net-next PATCH] tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit On Fri, Dec 11, 2020 at 5:28 PM Alexander Duyck <alexander.duyck@...il.com> wrote: > > From: Alexander Duyck <alexanderduyck@...com> > > There are cases where a fastopen SYN may trigger either a ICMP_TOOBIG > message in the case of IPv6 or a fragmentation request in the case of > IPv4. This results in the socket stalling for a second or more as it does > not respond to the message by retransmitting the SYN frame. > > Normally a SYN frame should not be able to trigger a ICMP_TOOBIG or > ICMP_FRAG_NEEDED however in the case of fastopen we can have a frame that > makes use of the entire MSS. In the case of fastopen it does, and an > additional complication is that the retransmit queue doesn't contain the > original frames. As a result when tcp_simple_retransmit is called and > walks the list of frames in the queue it may not mark the frames as lost > because both the SYN and the data packet each individually are smaller than > the MSS size after the adjustment. This results in the socket being stalled > until the retransmit timer kicks in and forces the SYN frame out again > without the data attached. > > In order to resolve this we can generate our best estimate for the original > packet size by detecting the fastopen SYN frame and then adding the > overhead for MAX_TCP_OPTION_SPACE and verifying if the SYN w/ data would > have exceeded the MSS. If so we can mark the frame as lost and retransmit > it. > > Signed-off-by: Alexander Duyck <alexanderduyck@...com> > --- > net/ipv4/tcp_input.c | 30 +++++++++++++++++++++++++++--- > 1 file changed, 27 insertions(+), 3 deletions(-) > > diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c > index 9e8a6c1aa019..79375b58de84 100644 > --- a/net/ipv4/tcp_input.c > +++ b/net/ipv4/tcp_input.c > @@ -2686,11 +2686,35 @@ static void tcp_mtup_probe_success(struct sock *sk) > void tcp_simple_retransmit(struct sock *sk) > { > const struct inet_connection_sock *icsk = inet_csk(sk); > + struct sk_buff *skb = tcp_rtx_queue_head(sk); > struct tcp_sock *tp = tcp_sk(sk); > - struct sk_buff *skb; > - unsigned int mss = tcp_current_mss(sk); > + unsigned int mss; > + > + /* A fastopen SYN request is stored as two separate packets within > + * the retransmit queue, this is done by tcp_send_syn_data(). > + * As a result simply checking the MSS of the frames in the queue > + * will not work for the SYN packet. So instead we must make a best > + * effort attempt by validating the data frame with the mss size > + * that would be computed now by tcp_send_syn_data and comparing > + * that against the data frame that would have been included with > + * the SYN. > + */ > + if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN && tp->syn_data) { > + struct sk_buff *syn_data = skb_rb_next(skb); > + > + mss = tcp_mtu_to_mss(sk, icsk->icsk_pmtu_cookie) + > + tp->tcp_header_len - sizeof(struct tcphdr) - > + MAX_TCP_OPTION_SPACE; nice comment! The original syn_data mss needs to be inferred which is a hassle to get right. my sense is path-mtu issue is enough to warrant they are lost. I suggest simply mark syn & its data lost if tcp_simple_retransmit is called during TFO handshake, i.e. diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 62f7aabc7920..7f0c4f2947eb 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -2864,7 +2864,8 @@ void tcp_simple_retransmit(struct sock *sk) unsigned int mss = tcp_current_mss(sk); skb_rbtree_walk(skb, &sk->tcp_rtx_queue) { - if (tcp_skb_seglen(skb) > mss) + if (tcp_skb_seglen(skb) > mss || + (tp->syn_data && sk->sk_state == TCP_SYN_SENT)) tcp_mark_skb_lost(sk, skb); } We have a TFO packetdrill test that verifies my suggested fix should trigger an immediate retransmit vs 1s wait. > > - skb_rbtree_walk(skb, &sk->tcp_rtx_queue) { > + if (syn_data && syn_data->len > mss) > + tcp_mark_skb_lost(sk, skb); > + > + skb = syn_data; > + } else { > + mss = tcp_current_mss(sk); > + } > + > + skb_rbtree_walk_from(skb) { > if (tcp_skb_seglen(skb) > mss) > tcp_mark_skb_lost(sk, skb); > } > >
Powered by blists - more mailing lists