Message-ID: <1418135209.14835.17.camel@edumazet-glaptop2.roam.corp.google.com>
Date:	Tue, 09 Dec 2014 06:26:49 -0800
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Thomas Jarosch <thomas.jarosch@...ra2net.com>
Cc:	Wolfgang Walter <linux@...m.de>, netdev@...r.kernel.org,
	Eric Dumazet <edumazet@...gle.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	Steffen Klassert <steffen.klassert@...unet.com>
Subject: Re: [bisected] xfrm: TCP connection initiating PMTU discovery
 stalls on v3.

On Tue, 2014-12-09 at 09:54 +0100, Thomas Jarosch wrote:
> On Monday, 8. December 2014 23:20:42 Wolfgang Walter wrote:
> > Am Freitag, 5. Dezember 2014, 05:26:25 schrieb Eric Dumazet:
> > > On Fri, 2014-12-05 at 13:09 +0100, Wolfgang Walter wrote:
> > > > Hello,
> > > > 
> > > > as reverting this patch fixes this rather annoying problem: is it
> > > > dangerous to revert it as a workaround until the root cause is found?
> > > 
> > > Unfortunately no, this patch fixes a serious issue.
> > > 
> > > We need to find the root cause of your problem instead of trying to work
> > > around it.
> > 
> > I only wanted to use it as local workaround here.
> > 
> > 
> > I looked a bit at the code. I'm not familiar with the network code, though
> > :-).
> 
> If it helps, I'm running the reverted patch on five production boxes hitherto 
> without a hiccup. As far as I understood the original commit message,
> some packet counters might be wrong without it.
> 
> @Eric: What could possibly go wrong(tm)? :)

Crashes in the TCP stack, because of packet count mismatches.

The sk_can_gso() status is already tested in tcp_sendmsg() as a hint,
since path behavior can change dynamically on an existing flow:

<start a TCP flow>
ethtool -K eth0 tso off gso off

In this case, the core networking stack detects this and segments the
packets _after_ the TCP or IP stack, before they reach eth0.
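
To make the late segmentation step concrete, here is a toy model in
Python. It is only a sketch of the idea, not kernel code: the names
software_segment() and expected_segments() and the MSS value of 1448 are
assumptions made for illustration.

```python
from math import ceil


def software_segment(payload: bytes, mss: int) -> list[bytes]:
    """Split one oversized packet into chunks of at most mss bytes.

    Toy stand-in for the software fallback that runs after TCP/IP has
    already built the packet and before it reaches the device.
    """
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


def expected_segments(length: int, mss: int) -> int:
    """Number of on-wire segments a payload of this length should produce."""
    return ceil(length / mss)


if __name__ == "__main__":
    super_packet = bytes(4000)   # one large packet as built by the sender
    segments = software_segment(super_packet, 1448)
    # Prints: 3 [1448, 1448, 1104]
    print(len(segments), [len(s) for s in segments])
```

The segment count a sender assumed and the number of packets that really
go out must agree; in this model that is the check
len(segments) == expected_segments(len(super_packet), mss).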

The TCP stack does not have to know that something changed right before
it hands a GSO packet to the core networking stack. Any such check would
be racy by nature, as TCP does not know or control the full path.
Thankfully, we do not take RTNL for every packet we send in TCP!
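
The race described above can be sketched with a small Python model. This
is a hypothetical illustration, not how the kernel is structured:
ToyDevice, tcp_build() and dev_transmit() are made-up names, and a single
boolean stands in for the TSO/GSO feature flags.

```python
from dataclasses import dataclass


@dataclass
class ToyDevice:
    """Hypothetical device whose offload flag can change at any time."""
    tso: bool = True


def tcp_build(dev: ToyDevice, data: bytes, mss: int) -> list[bytes]:
    """Sender side: read the offload flag only as a hint when building."""
    if dev.tso:
        return [data]                      # one large packet
    return [data[i:i + mss] for i in range(0, len(data), mss)]


def dev_transmit(dev: ToyDevice, packets: list[bytes], mss: int) -> list[bytes]:
    """Transmit side: re-check the flag and split what the device cannot take."""
    wire: list[bytes] = []
    for pkt in packets:
        if len(pkt) > mss and not dev.tso:
            wire.extend(pkt[i:i + mss] for i in range(0, len(pkt), mss))
        else:
            wire.append(pkt)
    return wire


if __name__ == "__main__":
    dev = ToyDevice(tso=True)
    queued = tcp_build(dev, bytes(3000), 1000)   # built while offload is on
    dev.tso = False          # stands in for "ethtool -K eth0 tso off gso off"
    wire = dev_transmit(dev, queued, 1000)
    print(len(queued), len(wire))                # prints: 1 3
```

In this model the sender never learns that the flag changed; the
transmit-side re-check is what keeps the packet valid for the device,
which is why the sender-side test can remain a hint.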

It seems XFRM triggers something in a slow path that is not handled
correctly.

It is not correct to add a racy kludge in the TCP fast path for this very
unlikely case.

I would disable TSO/GSO on xfrm, and the problem should disappear.
