Message-ID: <46d726f90703302325k3473ff1fy86e732d657f671b@mail.gmail.com>
Date: Sat, 31 Mar 2007 08:25:09 +0200
From: "Predrag Hodoba" <predrag.hodoba@...il.com>
To: "Rick Jones" <rick.jones2@...com>
Cc: "Stephen Hemminger" <shemminger@...ux-foundation.org>,
"David Miller" <davem@...emloft.net>, dagriego@...il.com,
netdev@...r.kernel.org
Subject: Re: [PATCH] NET: Add TCP connection abort IOCTL
On 30/03/07, Rick Jones <rick.jones2@...com> wrote:
> If the switchover from active to standby is "commanded" then there is
> the opportunity to "tell" the applications on the server to close their
> connections - either explicitly with some sort of defined interface, or
> implicitly by killing the processes. Then the IP can be brought-up on
> the standby and processes started/enabled/whatever and the clients can
> establish their new connections. The ioctl here (at least if it is like
> the tcp_discon options in HP-UX/Solaris) wouldn't be any better than
> just killing the process in so far as what happens on the network - in
> fact, it could be worse since the RST will not be retransmitted if lost,
> but FINs would. So, the ioctl could still leave clients twisting in the
> ether waiting for their application-level heartbeats to kick-in anyway.
> Heck, depending on their heartbeat lengths, even the FIN stuff if lost
> could leave them depending on their heartbeats.
>
> If the switchover from active to standby is "uncommanded" it probably
> means the primary went belly-up which means you don't have the
> opportunity to make an ioctl call anyway, and you are back to the
> heartbeats.
>
> rick jones
What I meant is that it could be used on the ***client***. Clients
are left stranded with invalid connections when a primary fails (your
"uncommanded" switchover scenario). They will eventually time out on
their own, but that takes time and you are not back online as fast as
you would like. If the cluster's services running on a client already
know about the failover (by means of the "heartbeat" and by observing
the change in cluster membership), then they can propagate that
knowledge to all processes unnecessarily blocked in their socket calls
towards the failed IP address. If these connections are forcibly
disconnected, the respective socket calls return with an error code,
and the processes can reconnect within a few seconds of the failure and
continue to do what they are meant to do.
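
To illustrate the effect on a blocked socket call, here is a minimal
Python sketch. It does not use the ioctl from the patch. It only shows
a user-space analogue inside a single process: one thread is blocked in
recv() towards a silent peer, and another thread, having learned of the
failover, calls shutdown() on the same socket. The names blocked_reader,
on_failover_detected and demo are hypothetical, the loopback listener
merely stands in for the failed primary, and the wake-up behaviour
assumed here is that of Linux. Unlike the proposed ioctl, this works
only on sockets the calling process itself owns, and shutdown() sends a
FIN rather than aborting the connection.

```python
import socket
import threading


def blocked_reader(sock, result):
    # Blocks in recv() until data arrives, the peer closes, or the
    # socket is shut down locally by another thread.
    try:
        data = sock.recv(4096)
        result["outcome"] = "eof" if data == b"" else "data"
    except OSError as exc:
        result["outcome"] = "error: %s" % exc


def on_failover_detected(sock):
    # Hypothetical hook: called by whatever component noticed the
    # change in cluster membership. shutdown() wakes any thread of
    # this process that is blocked on the socket.
    sock.shutdown(socket.SHUT_RDWR)


def demo():
    # Loopback listener standing in for the failed primary: it
    # accepts the connection and then never sends anything.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)

    client = socket.create_connection(server.getsockname())
    peer, _ = server.accept()

    result = {}
    reader = threading.Thread(target=blocked_reader, args=(client, result))
    reader.start()

    # Simulate the failover notification reaching this process.
    on_failover_detected(client)

    reader.join(timeout=5)
    result["reader_alive"] = reader.is_alive()

    for s in (client, peer, server):
        s.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Once the reader has returned, the process is free to open a new
connection to the address now served by the standby.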
predrag