Date: Fri, 6 Jun 2014 15:14:42 +0000
From: David Laight <David.Laight@...LAB.COM>
To: David Laight <David.Laight@...LAB.COM>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: SCTP seems to lose its socket state.
From: David Laight
> I've been looking at an ethernet trace from one of our customers.
> They seem to have got an SCTP socket into a rather confused state.
>
> There seem to be a significant number of transmitted ethernet frames
> that don't reach the far end.
> This shouldn't cause a real problem, but we end up with the following
> trace, which was taken on the linux system:
>
> 39964 0.304473 -> SCTP INIT
> 39965 0.292669 <- SCTP INIT (I think this has an invalid checksum)
> 39968 0.467935 <- SCTP INIT
> 39969 0.000093 -> SCTP INIT_ACK
> 39970 0.003947 <- SCTP COOKIE_ECHO
> 39971 0.000072 -> SCTP COOKIE_ACK
> 39972 0.000337 -> M3UA ASPUP
> 39979 0.809659 <- SCTP COOKIE_ECHO
> 39980 0.000058 -> SCTP COOKIE_ACK
> shutdown() called here - seems to be ignored
> 39983 0.949471 <- SCTP COOKIE_ECHO
> 39984 0.000053 -> SCTP COOKIE_ACK
> 39986 0.730072 -> M3UA ASPUP (same TSN as above)
> 40002 0.270589 -> M3UA ASPUP (same TSN as above)
> 40008 3.689088 <- SCTP HEARTBEAT
> 40009 0.000027 -> SCTP HEARTBEAT_ACK
> 40014 0.261152 <- SCTP HEARTBEAT
> 40015 0.000033 -> SCTP HEARTBEAT_ACK
> 40026 0.123048 <- SCTP HEARTBEAT
> 40027 0.000030 -> SCTP HEARTBEAT_ACK
> 40036 1.615048 -> M3UA ASPUP (same TSN as above)
>
> There are no signs of any SACKs for the ASPUP. I think the ASPUPs have the
> correct TSN (the same value as in the INIT_ACK).
> No signs of any shutdowns or aborts from either system.
>
> As seems to be typical for M3UA, the source and destination ports are
> the same. No additional IP addresses appear in the INIT (etc) messages.
I think I've reproduced this on a 3.14.0 kernel.

System A: Bind to port 1234, connect to B:1234.
    If the connect fails, retry 10 seconds later.
    When the connection completes, send some data.
    Disconnect if the reflected data isn't received within 2 seconds.
System B: Bind to port 1234, connect to A:1234.
    If the connect fails, retry 10 seconds later.
    Reflect any received data.
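
For anyone who wants to try this from user space, the two roles above can be roughly sketched in Python. This is only an approximation under stated assumptions: the real test used sockets created by a kernel M3UA driver, not this code. The function names (`open_and_connect`, `probe_once`, `reflect`, `run_a`) are made up for illustration, one-to-one style SCTP sockets are assumed, and retrying after a disconnect is my assumption since the description does not say what happens next. The `proto` parameter exists only so the logic can be exercised over TCP on a machine without SCTP support.

```python
import socket
import time

# SCTP is IP protocol 132; fall back to the number if the constant is missing.
SCTP = getattr(socket, "IPPROTO_SCTP", 132)
RETRY_DELAY = 10.0   # seconds between connect attempts (from the description)
ECHO_TIMEOUT = 2.0   # seconds to wait for the reflected data (from the description)


def open_and_connect(local, peer, proto=SCTP):
    """Bind to `local`, then actively connect to `peer`; raises OSError on failure."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto)
    try:
        s.bind(local)
        s.connect(peer)
    except OSError:
        s.close()
        raise
    return s


def probe_once(local, peer, payload=b"ping", proto=SCTP):
    """System A: connect, send data, return True if it is reflected within 2 s."""
    s = open_and_connect(local, peer, proto)
    try:
        s.settimeout(ECHO_TIMEOUT)
        s.sendall(payload)
        got = b""
        while len(got) < len(payload):
            chunk = s.recv(len(payload) - len(got))
            if not chunk:          # peer closed before reflecting everything
                return False
            got += chunk
        return got == payload
    except OSError:                # includes the receive timeout
        return False
    finally:
        s.close()                  # "disconnect"


def reflect(sock):
    """System B: echo everything received until the peer closes."""
    while True:
        data = sock.recv(4096)
        if not data:
            break
        sock.sendall(data)


def run_a(local, peer, proto=SCTP):
    """System A main loop; retrying after a disconnect is an assumption."""
    while True:
        try:
            ok = probe_once(local, peer, proto=proto)
            print("echo received" if ok else "no echo within 2 s, disconnected")
        except OSError as exc:
            print("connect failed:", exc)
        time.sleep(RETRY_DELAY)
```

On system A this would be started as something like `run_a(("0.0.0.0", 1234), ("B", 1234))`, with "B" replaced by the peer's address. System B would call `open_and_connect` in a similar retry loop and pass the connected socket to `reflect`.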
Initially the INIT chunks generate ABORTs (no listener) so both
programs just retry every 10 seconds.
On B run:
    iptables -A INPUT -p sctp --chunk-types any INIT -j DROP
    iptables -A INPUT -p sctp --chunk-types any DATA -j DROP
The first rule allows the connection to complete.
The second stops B from acking the data.
The data is resent on timeout, and the systems exchange HBs.
I'd expect a SHUTDOWN or ABORT to be sent reasonably quickly,
but the systems just exchange HBs for over 5 minutes.
(I'm seeing an ABORT because B gives up waiting for the message.)
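
As a point of comparison for "reasonably quickly", here is a rough back-of-envelope calculation of how long DATA retransmission alone would take before the sender gives up. It is a sketch under assumptions, not a statement about what either kernel does. It uses the values suggested in RFC 4960 (RTO.Initial 3 s, RTO.Min 1 s, RTO.Max 60 s, Association.Max.Retrans 10), assumes the RTO doubles on every expiry, and assumes nothing clears the error counter in between. Whether the test systems use those values is not known from the report. The helper name `time_until_give_up` is made up.

```python
def time_until_give_up(rto_start, rto_max, max_retrans):
    """Sum the timer intervals for the original transmission plus
    `max_retrans` retransmissions, doubling the RTO after each expiry
    and capping it at `rto_max`. All times are in seconds."""
    rto = rto_start
    total = 0.0
    for _ in range(max_retrans + 1):
        total += rto
        rto = min(rto * 2, rto_max)
    return total


if __name__ == "__main__":
    # Assumed parameters (RFC 4960 suggested values), not measured ones.
    print(time_until_give_up(3.0, 60.0, 10))   # starting from RTO.Initial: 453.0
    print(time_until_give_up(1.0, 60.0, 10))   # starting from RTO.Min:     363.0
```

These figures only bound the pure backoff case under the stated assumptions; they give something to compare the observed "over 5 minutes" against.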
If I discard the COOKIE_ECHO then I do see an outwards disconnect
after a few retries.
I'm testing with sockets created by our M3UA kernel driver,
and system B is running a much older kernel (2.6.26).
Neither should make any difference.
David