Date:	Mon, 9 Jun 2014 12:49:40 +0000
From:	David Laight <David.Laight@...LAB.COM>
To:	"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: SCTP seems to lose its socket state.

I think I have now reproduced the problem.

> From: David Laight
> > I've been looking at an ethernet trace from one of our customers.
> > They seem to have got an SCTP socket into a rather confused state.
> >
> > There seem to be a significant number of transmit ethernet frames
> > that don't reach the far end.
> > This shouldn't cause a real problem, but we end up with the following:
> > This trace was taken on the linux system:
> >
> > 39964   0.304473        ->      SCTP    INIT
> > 39965   0.292669        <-      SCTP    INIT  (I think this has an invalid checksum)
> > 39968   0.467935        <-      SCTP    INIT
> > 39969   0.000093        ->      SCTP    INIT_ACK
> > 39970   0.003947        <-      SCTP    COOKIE_ECHO
> > 39971   0.000072        ->      SCTP    COOKIE_ACK
> > 39972   0.000337        ->      M3UA    ASPUP
> > 39979   0.809659        <-      SCTP    COOKIE_ECHO
> > 39980   0.000058        ->      SCTP    COOKIE_ACK
> > shutdown() called here - seems to be ignored
> > 39983   0.949471        <-      SCTP    COOKIE_ECHO
> > 39984   0.000053        ->      SCTP    COOKIE_ACK
> > 39986   0.730072        ->      M3UA    ASPUP           Same TSN as above
> > 40002   0.270589        ->      M3UA    ASPUP           Same TSN as above
> > 40008   3.689088        <-      SCTP    HEARTBEAT
> > 40009   0.000027        ->      SCTP    HEARTBEAT_ACK
> > 40014   0.261152        <-      SCTP    HEARTBEAT
> > 40015   0.000033        ->      SCTP    HEARTBEAT_ACK
> > 40026   0.123048        <-      SCTP    HEARTBEAT
> > 40027   0.000030        ->      SCTP    HEARTBEAT_ACK
> > 40036   1.615048        ->      M3UA    ASPUP           Same TSN as above
> >
> > There are no signs of any SACKs for the ASPUP, I think they have the
> > correct TSN (the same value as in the INIT_ACK).
> > No signs of any shutdowns or aborts from either system.
> >
> > As seems to be typical for M3UA the source and destination ports are
> > the same. No additional IP addresses appear in the INIT (etc) messages.
> 
> I think I've reproduced this on a 3.14.0 kernel.
> 
> System A: Bind to port 1234, connect to B:1234.
>           If the connect fails, retry 10 seconds later.
>           When the connection completes send some data.
>           Disconnect if the reflected data isn't received within 2 seconds.
> System B: Bind to port 1234, connect to A:1234.
>           If the connect fails, retry 10 seconds later.
>           Reflect any received data.

Add here: setsockopt(sock, SOL_SOCKET, SO_LINGER, { 1, 0 }, ...);
If no data is received within a few seconds, close() the socket
(do not call shutdown()), and retry.
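
For anyone scripting the test rather than writing it in C, a minimal Python
sketch of the SO_LINGER { 1, 0 } setting follows. The helper names are
made up for illustration, and the example uses a TCP socket only so that it
runs on a machine without SCTP support; the assumption is that the same
call applies unchanged to an SCTP socket.

```python
import socket
import struct

# struct linger is two C ints: l_onoff, l_linger.
LINGER_FMT = "ii"


def enable_abortive_close(sock):
    """Set SO_LINGER to { 1, 0 } so close() aborts instead of shutting down.

    Hypothetical helper name; the option values are the ones from the text.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack(LINGER_FMT, 1, 0))


def get_linger(sock):
    """Return (l_onoff, l_linger) as currently set on the socket."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                          struct.calcsize(LINGER_FMT))
    return struct.unpack(LINGER_FMT, raw)


if __name__ == "__main__":
    # Assumption: for the real test this would be an SCTP socket, e.g.
    # socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_SCTP)
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_abortive_close(s)
    print(get_linger(s))
    s.close()
```

With this set, the later close() is what is expected to generate the ABORT
that the iptables rule then discards.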

Initially the INIT chunks generate ABORTs (no listener) so both
programs just retry every 10 seconds.
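
The per-connection check on System A (send some data, give up if nothing
is reflected within 2 seconds) might be sketched as below. This is not the
actual test program; the function name, the payload and the default
timeout constant are illustrative assumptions, and the socket is expected
to be connected already.

```python
import socket

REPLY_TIMEOUT = 2.0  # seconds to wait for reflected data (from the description)


def reflected_in_time(sock, payload=b"ping", timeout=REPLY_TIMEOUT):
    """Send payload and report whether any data comes back before timeout.

    Returns False on timeout, on a connection error, or if the peer has
    closed the connection (recv() returns an empty byte string).
    """
    sock.settimeout(timeout)
    try:
        sock.sendall(payload)
        return bool(sock.recv(len(payload)))
    except OSError:  # socket.timeout is a subclass of OSError
        return False
```

A caller would close() the socket (without shutdown()) when this returns
False, sleep for 10 seconds, and then bind and connect again.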

On B run:
    iptables -A OUTPUT -p sctp --chunk-types any ABORT -j DROP
    iptables -A INPUT -p sctp --chunk-types any DATA -j DROP
The first allows the connection to complete, and then drops the
ABORT sent by close().
The second stops B acking the data.

System A now receives a new INIT (with a different TSN) and responds with
an INIT_ACK (followed by a COOKIE_ECHO and COOKIE_ACK) even though
it doesn't have a socket in a suitable state for the connection.

I think the INIT should act as a received ABORT on the old connection,
and then be processed as a new connection - in this case generating
an ABORT because there is no listening socket.
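
To make the suggested ordering concrete, here is a toy Python model of it.
It is only a sketch of the behaviour proposed above, not kernel code and
not a statement of what the SCTP specification requires; the function
name, the set-based bookkeeping and the action strings are all invented
for illustration.

```python
def handle_init(associations, listening_ports, peer, port):
    """Model the proposed handling of an INIT from `peer` to local `port`.

    associations    -- set of (peer, port) pairs with an existing association
    listening_ports -- set of local ports that have a listening socket
    Returns the list of actions taken, in order.
    """
    actions = []
    key = (peer, port)
    if key in associations:
        # Step 1: treat the INIT as a received ABORT on the old connection.
        associations.discard(key)
        actions.append("abort old association")
    # Step 2: process the INIT as a brand-new connection attempt.
    if port in listening_ports:
        actions.append("send INIT_ACK")
    else:
        actions.append("send ABORT")
    return actions


if __name__ == "__main__":
    # System A still holds the stale association with B and is not listening.
    assocs = {("B", 1234)}
    print(handle_init(assocs, set(), "B", 1234))
```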

With the code I'm running the INIT is repeated every 30 seconds.
No sign of any DATA retransmits after the first INIT (for over 20 minutes now).

I suspect that a simpler test of forcing a disconnect to use an ABORT and
using iptables to discard the ABORT would be enough to show the problem.

	David



