Message-ID: <20120720183629.GE22367@hmsreliant.think-freely.org>
Date: Fri, 20 Jul 2012 14:36:29 -0400
From: Neil Horman <nhorman@...driver.com>
To: Vlad Yasevich <vyasevich@...il.com>
Cc: netdev@...r.kernel.org, Sridhar Samudrala <sri@...ibm.com>,
"David S. Miller" <davem@...emloft.net>,
linux-sctp@...r.kernel.org, joe@...ches.com
Subject: Re: [PATCH v4] sctp: Implement quick failover draft from tsvwg
On Fri, Jul 20, 2012 at 01:55:35PM -0400, Vlad Yasevich wrote:
> On 07/20/2012 01:19 PM, Neil Horman wrote:
> >I've seen several recent attempts to do quick failover of sctp transports
> >by reducing various retransmit timers and counters. While it's possible to
> >implement a faster failover on multihomed sctp associations that way, it's not
> >particularly robust, in that it can lead to unneeded retransmits, as well as
> >false connection failures due to intermittent latency on a network.
> >
> >Instead, let's implement the new IETF quick failover draft found here:
> >http://tools.ietf.org/html/draft-nishida-tsvwg-sctp-failover-05
> >
> >This will let the sctp stack identify transports that have had a small number of
> >errors, and quickly stop using them until their reliability can be
> >re-established. I've tested this out on two virt guests connected via multiple
> >isolated virt networks and believe it's in compliance with the above draft and
> >works well.
> >
> >Signed-off-by: Neil Horman <nhorman@...driver.com>
> >CC: Vlad Yasevich <vyasevich@...il.com>
> >CC: Sridhar Samudrala <sri@...ibm.com>
> >CC: "David S. Miller" <davem@...emloft.net>
> >CC: linux-sctp@...r.kernel.org
> >CC: joe@...ches.com
> >
> >---
> >Change notes:
> >
> >V2)
> >- Added socket option API from section 6.1 of the specification, as per
> >request from Vlad. Adding this socket option allows us to alter both the path
> >maximum retransmit value and the path partial failure threshold for each
> >transport and the association as a whole.
> >
> >- Added a per transport pf_retrans value, and initialized it from the
> >association value. This makes each transport independently configurable as per
> >the socket option above, and prevents changes in the sysctl from bleeding into
> >an already created association.
> >
> >V3)
> >- Cleaned up some line spacing (Joe Perches)
> >- Fixed some socket option user data sanitization (Vlad Yasevich)
> >
> >V4)
> >- Added additional documentation (Flavio Leitner)
> >---
> > Documentation/networking/ip-sysctl.txt | 14 +++++
> > include/net/sctp/constants.h | 1 +
> > include/net/sctp/structs.h | 20 ++++++-
> > include/net/sctp/user.h | 11 ++++
> > net/sctp/associola.c | 37 ++++++++++--
> > net/sctp/outqueue.c | 6 +-
> > net/sctp/sm_sideeffect.c | 33 +++++++++-
> > net/sctp/socket.c | 100 ++++++++++++++++++++++++++++++++
> > net/sctp/sysctl.c | 9 +++
> > net/sctp/transport.c | 4 +-
> > 10 files changed, 220 insertions(+), 15 deletions(-)
> >
>
> [ snip ]
>
> >
> >diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> >index b3b8a8d..fef9bfa 100644
> >--- a/net/sctp/socket.c
> >+++ b/net/sctp/socket.c
> >@@ -3470,6 +3470,56 @@ static int sctp_setsockopt_auto_asconf(struct sock *sk, char __user *optval,
> > }
> >
> >
> >+/*
> >+ * SCTP_PEER_ADDR_THLDS
> >+ *
> >+ * This option allows us to alter the partially failed threshold for one or all
> >+ * transports in an association. See Section 6.1 of:
> >+ * http://www.ietf.org/id/draft-nishida-tsvwg-sctp-failover-05.txt
> >+ */
> >+static int sctp_setsockopt_paddr_thresholds(struct sock *sk,
> >+					    char __user *optval,
> >+					    unsigned int optlen)
> >+{
> >+	struct sctp_paddrthlds val;
> >+	struct sctp_transport *trans;
> >+	struct sctp_association *asoc;
> >+
> >+	if (optlen < sizeof(struct sctp_paddrthlds))
> >+		return -EINVAL;
> >+	if (copy_from_user(&val, (struct sctp_paddrthlds __user *)optval,
> >+			   sizeof(struct sctp_paddrthlds)))
> >+		return -EFAULT;
> >+
> >+	/* path_max_retrans shouldn't ever be zero */
> >+	if (!val.spt_pathmaxrxt)
> >+		return -EINVAL;
>
> I am not sure I like this solution. This means that the application
> must fetch the pathmaxrxt and then write the same value back here.
> Why not simply ignore the pathmaxrxt if it's 0? That way someone
> can just tweak the pf value without changing the pathmaxrxt.
>
>
Yeah, I can make that change.
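
For reference, a minimal user-space sketch of the behaviour suggested above:
a zero spt_pathmaxrxt is treated as "leave the current value alone" rather
than rejected with -EINVAL. This is not the kernel code. The two structs and
apply_thresholds() are hypothetical stand-ins, and the spt_pathpfthld field
name is assumed here rather than taken from the quoted hunk.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct sctp_paddrthlds; only the two
 * threshold fields discussed in this thread are modelled. */
struct paddrthlds_model {
	uint16_t spt_pathmaxrxt;	/* 0 means "do not change" in this sketch */
	uint16_t spt_pathpfthld;	/* assumed name for the pf threshold field */
};

/* Hypothetical stand-in for the per-transport settings. */
struct transport_model {
	uint16_t pathmaxrxt;
	uint16_t pf_retrans;
};

/* Apply user-supplied thresholds to one transport.  A zero
 * spt_pathmaxrxt is ignored, so the caller can tweak only the
 * pf threshold without first reading back pathmaxrxt. */
static void apply_thresholds(struct transport_model *t,
			     const struct paddrthlds_model *val)
{
	if (val->spt_pathmaxrxt)
		t->pathmaxrxt = val->spt_pathmaxrxt;
	t->pf_retrans = val->spt_pathpfthld;
}
```

With this shape, passing { .spt_pathmaxrxt = 0, .spt_pathpfthld = 2 } updates
only pf_retrans, which is the use case described in the review comment.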
Neil