Date: Fri, 05 Dec 2014 13:09:59 +0100
From: Wolfgang Walter <linux@...m.de>
To: netdev@...r.kernel.org
Cc: Thomas Jarosch <thomas.jarosch@...ra2net.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Steffen Klassert <steffen.klassert@...unet.com>
Subject: Re: [bisected] xfrm: TCP connection initiating PMTU discovery stalls on v3.12+

Hello,

since reverting this patch fixes this rather annoying problem: is it dangerous
to revert it as a workaround until the root cause is found?

On Monday, 1 December 2014, 17:41:23, Wolfgang Walter wrote:
> On Monday, 1 December 2014, 14:17:28, Wolfgang Walter wrote:
> > On Saturday, 29 November 2014, 12:44:07, Thomas Jarosch wrote:
> > > Hello,
> > >
> > > we're in the process of updating production level machines
> > > from kernel 3.4.101 to kernel 3.14.25. On one mail server
> > > we noticed that emails destined for an IPSec tunnel sometimes
> > > get stuck in the mail queue with TCP timeouts.
> > >
> > > To make a long story short: When the VPN connection is initially
> > > set up or renewed, the path MTU for the xfrm tunnel is undetermined.
> > >
> > > As soon as a TCP client starts to send large packets,
> > > it triggers path MTU detection. Some middlebox on the
> > > way to the final server has a lower MTU and sends back
> > > an "ICMP fragmentation needed" packet as normal.
> > >
> > > With the old kernel, the packet size for the TCP connection inside
> > > the xfrm tunnel gets adjusted and all is fine. With kernel v3.12+,
> > > the connection stalls completely. Same thing with kernel v3.18-rc6.
> >
> > We see something similar with a real NIC (RTL8139). In our case only the
> > first TCP connection which triggers PMTU discovery stalls. Later TCP
> > connections then work fine.
> >
> > I will revert that patch and see if that fixes the problem.
>
> Reverting the commit fixes the problem here, too.
>
> > > We wrote a small tool to mimic postfix's TCP behavior (see attached
> > > file).
> > > In the end it's a normal TCP client sending large packets.
> > > The server side is just "socat - tcp4-listen:667".
> > >
> > > If you run "socket_client" a second time, the path MTU
> > > for the xfrm tunnel is already known and packets flow normally, too.
> > >
> > >
> > > The "evil" commit in question is this one:
> > > ---------------------------------------------------------------------
> > > commit 8f26fb1c1ed81c33f5d87c5936f4d9d1b4118918
> > > Author: Eric Dumazet <edumazet@...gle.com>
> > > Date:   Tue Oct 15 12:24:54 2013 -0700
> > >
> > >     tcp: remove the sk_can_gso() check from tcp_set_skb_tso_segs()
> > >
> > >     sk_can_gso() should only be used as a hint in tcp_sendmsg() to build GSO
> > >     packets in the first place. (As a performance hint)
> > >
> > >     Once we have GSO packets in write queue, we can not decide they are no
> > >     longer GSO only because flow now uses a route which doesn't handle
> > >     TSO/GSO.
> > >
> > >     Core networking stack handles the case very well for us, all we need
> > >     is keeping track of packet counts in MSS terms, regardless of
> > >     segmentation done later (in GSO or hardware)
> > >
> > >     Right now, if tcp_fragment() splits a GSO packet in two parts,
> > >     @left and @right, and route changed through a non GSO device,
> > >     both @left and @right have pcount set to 1, which is wrong,
> > >     and leads to incorrect packet_count tracking.
> > >
> > >     This problem was added in commit d5ac99a648 ("[TCP]: skb pcount with MTU
> > >     discovery")
> > >
> > >     Signed-off-by: Eric Dumazet <edumazet@...gle.com>
> > >     Signed-off-by: Neal Cardwell <ncardwell@...gle.com>
> > >     Signed-off-by: Yuchung Cheng <ycheng@...gle.com>
> > >     Reported-by: Maciej Żenczykowski <maze@...gle.com>
> > >     Signed-off-by: David S. Miller <davem@...emloft.net>
> > >
> > > diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> > > index 8fad1c1..d46f214 100644
> > > --- a/net/ipv4/tcp_output.c
> > > +++ b/net/ipv4/tcp_output.c
> > > @@ -989,8 +989,7 @@ static void tcp_set_skb_tso_segs(const struct sock *sk, struct sk_buff *skb,
> > >  	/* Make sure we own this skb before messing gso_size/gso_segs */
> > >  	WARN_ON_ONCE(skb_cloned(skb));
> > >
> > > -	if (skb->len <= mss_now || !sk_can_gso(sk) ||
> > > -	    skb->ip_summed == CHECKSUM_NONE) {
> > > +	if (skb->len <= mss_now || skb->ip_summed == CHECKSUM_NONE) {
> > >  		/* Avoid the costly divide in the normal
> > >  		 * non-TSO case.
> > >  		 */
> > > ---------------------------------------------------------------------
> > >
> > > When I revert it, even kernel v3.18-rc6 starts working.
> > > But I doubt this is the root problem; it may just be hiding another issue.
> > >
> > > --- Sample output of socket_client using vanilla v3.12 kernel ---
> > > [1417258063 SEND result: 4096, strerror: Success]
> > > tcp max seg: res: 0, max_seg: 1370
> > > [1417258063 SEND result: 4096, strerror: Success]
> > > tcp max seg: res: 0, max_seg: 1370
> > > [1417258063 SEND result: 4096, strerror: Success]
> > > tcp max seg: res: 0, max_seg: 1370
> > > [1417258063 SEND result: 4096, strerror: Success]
> > > tcp max seg: res: 0, max_seg: 1370
> > > [1417258063 SEND result: 4096, strerror: Success]
> > > tcp max seg: res: 0, max_seg: 1338
> > > [1417258063 SEND result: 4096, strerror: Success]
> > > tcp max seg: res: 0, max_seg: 1338
> > > *STUCK*
> > > --------------------------------------------------------
> > >
> > > The "machine" is running on KVM and using "virtio_net" as NIC driver.
> > > I've played with the ethtool offload settings:
> > >
> > > *** eth1 defaults ***
> > > Offload parameters for eth1:
> > > rx-checksumming: on
> > > tx-checksumming: on
> > > scatter-gather: on
> > > tcp-segmentation-offload: on
> > > udp-fragmentation-offload: on
> > > generic-segmentation-offload: on
> > > generic-receive-offload: on
> > > large-receive-offload: off
> > >
> > > *** eth1 working (no stalls) using vanilla kernel ***
> > > Offload parameters for eth1:
> > > rx-checksumming: on
> > > tx-checksumming: off  <-- the magic switch
> > > scatter-gather: off
> > > tcp-segmentation-offload: off
> > > udp-fragmentation-offload: off
> > > generic-segmentation-offload: off
> > > generic-receive-offload: off
> > > large-receive-offload: off
> > >
> > > When I turn "tx-checksumming" back on, it fails again.
> > > Though that is probably also just a side effect.
> > >
> > > I can provide tcpdumps if needed but they are no real help
> > > since you can just see the kernel stops sending TCP packets.
> > > (and the outgoing TCP packets are encrypted in ESP packets)
> > >
> > >
> > > Any vague idea what might be the root cause?
> > >
> > > I also tried reverting commit 4d53eff48b5f03ce67f4f301d6acca1d2145cb7a
> > > ("xfrm: Don't queue retransmitted packets if the original is still on the
> > > host") but that didn't change the situation. In fact it wasn't even
> > > triggered.
> > >
> > > Please CC: comments. Thanks.
> > >
> > > Best regards,
> > > Thomas
> >
> > Regards,
>
> Regards,

Regards,
-- 
Wolfgang Walter
Studentenwerk München
Anstalt des öffentlichen Rechts
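
For readers following the quoted diff, here is a rough userspace sketch of what the changed condition does to the packet count (pcount) of an skb. It is not kernel code: `struct fake_skb`, `pcount_old()`, `pcount_new()` and `div_round_up()` are hypothetical names made up for illustration, and the ceiling division in the non-trivial branch is an assumption based on the "costly divide" comment in the diff, since that branch is not shown in the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two sk_buff properties the check reads. */
struct fake_skb {
    unsigned int len;   /* payload length in bytes */
    bool csum_none;     /* models skb->ip_summed == CHECKSUM_NONE */
};

/* Ceiling division (assumed to be what the "costly divide" computes). */
static unsigned int div_round_up(unsigned int n, unsigned int d)
{
    return (n + d - 1) / d;
}

/* Condition before the commit: a socket that cannot do GSO
 * always gets a packet count of 1, whatever the skb length. */
static unsigned int pcount_old(const struct fake_skb *skb,
                               unsigned int mss_now, bool can_gso)
{
    assert(mss_now > 0);
    if (skb->len <= mss_now || !can_gso || skb->csum_none)
        return 1;
    return div_round_up(skb->len, mss_now);
}

/* Condition after the commit: only length and checksum state decide. */
static unsigned int pcount_new(const struct fake_skb *skb,
                               unsigned int mss_now)
{
    assert(mss_now > 0);
    if (skb->len <= mss_now || skb->csum_none)
        return 1;
    return div_round_up(skb->len, mss_now);
}
```

With purely illustrative numbers borrowed from the sample output (a 4096-byte skb, MSS 1370), `pcount_old()` with `can_gso` false yields 1, while `pcount_new()` yields 3, and 4 once the MSS drops to 1338. A skb with `csum_none` set still yields 1 under both variants, which would at least be consistent with the "tx-checksumming: off" observation, although nothing in this thread confirms that as the cause.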