[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160816183330.GD3110@localhost.localdomain>
Date: Tue, 16 Aug 2016 15:33:30 -0300
From: Marcelo Ricardo Leitner <marcelo.leitner@...il.com>
To: Xin Long <lucien.xin@...il.com>
Cc: David Laight <David.Laight@...lab.com>,
David Miller <davem@...emloft.net>,
network dev <netdev@...r.kernel.org>,
"linux-sctp@...r.kernel.org" <linux-sctp@...r.kernel.org>,
Vladislav Yasevich <vyasevich@...il.com>,
"daniel@...earbox.net" <daniel@...earbox.net>
Subject: Re: [PATCH net] sctp: fix a success return may hide an error
On Wed, Aug 17, 2016 at 02:24:19AM +0800, Xin Long wrote:
> >> > This err returns back to sctp_sendmsg, there sctp will abort asoc.
> >
> > That's not right I think. sctp_sendmsg will only free the asoc if it was
> > created to send that specific chunk. And in this case, this change
> > should have no effect as it can't have sctp_outq_flush() touching
> > several transports in a row.
> >
> > I'm basing on:
> > out_free:
> > if (new_asoc)
> > sctp_association_free(asoc);
> >
> > and sctp_recvmsg will just fetch, return and clear the error via
> > sctp_skb_recv_datagram, but not free it.
> >
> > Do you see any other place freeing it?
> Sorry, you are right, it free assoc just for new_asoc.
>
> >
> >>
> >> That doesn't seem a good idea.
> >> You don't want to abort the association if there is a transient
> >> memory allocation failure.
> >> You also can't drop data chunks.
> >
> > From a system-wise POV, this behavior - to free the new asoc in case of
> > transient memory allocation failure - doesn't seem bad to me.
> > That's what will have to happen if any allocation before it failed and
> > also it helps the system to reduce the stress a little bit. I don't see
> > any inconsistency/problems here because we are not dropping a single
> > random chunk but instead we are actually refusing to initialize a new
> > asoc in such conditions.
> >
> > Nevertheless, I agree that letting the application see ENOMEM errors when
> > the data actually got queued and is being fully handled, as in, it will
> > be retransmitted later, is not be wise, as the application probably
> > won't be able to distinguish from ENOMEMs that it should retry or not.
> > Here I see a problem, yet it's not due to this specific change, perhaps
> > it just got attention because of it. In this situation, we should handle
> > ENOMEMs internally if possible so the application can know that if it
> > hits an ENOMEM, it's real and it has to retry.
> If letting the application see ENOMEM errors, and sctp has to drop this
> chunk, instead of retransmiting the ENOMEM chunk, but the ENOMEM
> chunk may not be the chunk from current msg, as it flush all the queue.
> even if users get an ENOMEM error, they may re-send a chunk that is same
> with the one that is still in retransmit queue.
Yep, one more reason to handle those internally when safe.
Powered by blists - more mailing lists