Message-ID: <063D6719AE5E284EB5DD2968C1650D6D1CB7979F@AcuExch.aculab.com>
Date: Wed, 12 Aug 2015 15:33:51 +0000
From: David Laight <David.Laight@...LAB.COM>
To: 'Marcelo Ricardo Leitner' <marcelo.leitner@...il.com>,
"cluster-devel@...hat.com" <cluster-devel@...hat.com>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
Vlad Yasevich <vyasevich@...il.com>,
Neil Horman <nhorman@...driver.com>,
David Teigland <teigland@...hat.com>,
"tan.hu@....com.cn" <tan.hu@....com.cn>
Subject: RE: [PATCH 4/6] dlm: use sctp 1-to-1 API
From: Marcelo Ricardo Leitner
> Sent: 12 August 2015 14:16
> Em 12-08-2015 07:23, David Laight escreveu:
> > From: Marcelo Ricardo Leitner
> >> Sent: 11 August 2015 23:22
> >> DLM uses the 1-to-many API, but in a 1-to-1 fashion. The 1-to-many API
> >> is not needed, and it forces DLM to use sctp_do_peeloff() to mimic a
> >> kernel_accept(), which creates a symbol dependency on the sctp module.
> >>
> >> By switching it to 1-to-1 API we can avoid this dependency and also
> >> reduce quite a lot of SCTP-specific code in lowcomms.c.
> > ...
> >
> > You still need to enable sctp notifications (I think the patch deleted
> > that code).
> > Otherwise you don't get any kind of indication if the remote system
> > 'resets' (i.e. sends a new INIT chunk) on an existing connection.
>
> Right, it would miss the restart event and could generate corrupted
> tx/rx buffers by gluing parts of old messages onto new ones.
Except that it is SCTP so you'd expect DATA chunks to contain entire
messages and so get unexpected message sequences rather than corrupt
messages.
The problem is that the recovery is likely to be another reset.
(Particularly with M3UA where the source and destination port numbers
are likely to be the same and fixed.)
> > It is probably enough to treat the MSG_NOTIFICATION as a fatal error
> > and close the socket.
>
> Just so we are on the same page, you mean that after accepting the new
> association and enabling notifications on it, any further notification
> on it can be treated as a fatal error, right? Seems reasonable to me.
That's what I had to do.
The far end will probably see an additional disconnect, but it shouldn't
matter.
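
Roughly, the receive side then looks like the sketch below. This is a
userspace illustration rather than the lowcomms.c code: the names
classify_rx() and rx_or_close() are made up for the example, and it
assumes association events were already enabled on the socket (for
instance with the SCTP_EVENTS socket option), since otherwise no
notification is ever delivered.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* MSG_NOTIFICATION normally comes from <netinet/sctp.h> or <linux/sctp.h>.
 * The fallback below is the Linux value and is an assumption made only so
 * the sketch builds without the SCTP headers installed. */
#ifndef MSG_NOTIFICATION
#define MSG_NOTIFICATION 0x8000
#endif

enum rx_action { RX_DELIVER, RX_CLOSE };

/* Decide what to do with one recvmsg() result on a 1-to-1 SCTP socket. */
static enum rx_action classify_rx(ssize_t n, int msg_flags)
{
    if (n <= 0)
        return RX_CLOSE;        /* error or orderly shutdown */
    if (msg_flags & MSG_NOTIFICATION)
        return RX_CLOSE;        /* any event (e.g. a restart) is fatal */
    return RX_DELIVER;          /* ordinary user data */
}

/* Hypothetical helper: read one message and close fd on any notification.
 * Returns the number of bytes received, or -1 if the socket was closed. */
static ssize_t rx_or_close(int fd, void *buf, size_t len)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    ssize_t n;

    do {
        n = recvmsg(fd, &msg, 0);
    } while (n < 0 && errno == EINTR);

    if (classify_rx(n, msg.msg_flags) == RX_CLOSE) {
        close(fd);
        return -1;
    }
    return n;
}
```

The caller treats -1 as "connection gone" and waits for a fresh
association on the listening socket, so it never needs to parse the
notification itself.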
> > This is probably a bug in the sctp stack - if a connection is reset
> > but the user hasn't requested notifications then it should be
> > converted to a disconnect indication and a new incoming connection.
>
> Maybe in such a case resets shouldn't be allowed at all? Unless the reset
> happens in a moment of silence, it will always lead to application buffer
> corruption. I checked the RFCs just now but couldn't find anything
> restricting resets to some condition.
I certainly expected the 'reset' to cause an inwards abortive disconnect
on the old socket and a new indication on the listening socket.
I think (hope) that is what you get for a TCP SYN that matches an existing
connection.
In our case I think they were happening when the remote system was power
cycled.
David