Message-ID: <20110706121508.GA19518@hmsreliant.think-freely.org>
Date: Wed, 6 Jul 2011 08:15:09 -0400
From: Neil Horman <nhorman@...driver.com>
To: Vladislav Yasevich <vladislav.yasevich@...com>,
netdev@...r.kernel.org, davem@...emloft.net,
Wei Yongjun <yjwei@...fujitsu.com>,
Sridhar Samudrala <sri@...ibm.com>, linux-sctp@...r.kernel.org
Subject: Re: [PATCHv2] sctp: Enforce retransmission limit during shutdown
On Mon, Jul 04, 2011 at 09:50:19AM -0400, Thomas Graf wrote:
> When initiating a graceful shutdown while having data chunks
> on the retransmission queue with a peer which is in zero
> window mode the shutdown is never completed because the
> retransmission error count is reset periodically by the
> following two rules:
>
> - Do not timeout association while doing zero window probe.
> - Reset overall error count when a heartbeat request has
> been acknowledged.
>
> The graceful shutdown will wait for all outstanding TSN to
> be acknowledged before sending the SHUTDOWN request. This
> never happens due to the peer's zero window not acknowledging
> the continuously retransmitted data chunks. Although the
> error counter is incremented for each failed retransmission,
> the receiving of the SACK announcing the zero window clears
> the error count again immediately. Also heartbeat requests
> continue to be sent periodically. The peer acknowledges these
> requests causing the error counter to be reset as well.
>
> This patch changes behaviour to only reset the overall error
> counter for the above rules while not in shutdown. After
> reaching the maximum number of retransmission attempts, the
> T5 shutdown guard timer is scheduled to give the receiver
> some additional time to recover. The timer is stopped as soon
> as the receiver acknowledges any data.
>
> The issue can be easily reproduced by establishing a sctp
> association over the loopback device, constantly queueing
> data at the sender while not reading any at the receiver.
> Wait for the window to reach zero, then initiate a shutdown
> by killing both processes simultaneously. The association
> will never be freed and the chunks on the retransmission
> queue will be retransmitted indefinitely.
>
> Signed-off-by: Thomas Graf <tgraf@...radead.org>
<snip>
> --- a/net/sctp/sm_statefuns.c
> +++ b/net/sctp/sm_statefuns.c
> @@ -5154,7 +5154,7 @@ sctp_disposition_t sctp_sf_do_9_2_start_shutdown(
> * The sender of the SHUTDOWN MAY also start an overall guard timer
> * 'T5-shutdown-guard' to bound the overall time for shutdown sequence.
> */
> - sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_START,
> + sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_RESTART,
> SCTP_TO(SCTP_EVENT_TIMEOUT_T5_SHUTDOWN_GUARD));
>
How come you're modifying this chunk to use TIMER_RESTART rather than
TIMER_START? sctp_sf_do_9_2_start_shutdown() is where the T5 timer is
actually started, isn't it? The rest looks OK to me, I think.
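
For anyone following along, here is a rough userspace sketch of the
behaviour the changelog describes, as I read it. It is not the kernel
code: struct assoc_model, its fields and the three handler names are all
hypothetical, and the exact comparison against the retransmission limit
is an assumption on my part.

```c
#include <stdbool.h>

/* Hypothetical, simplified model; NOT the kernel's struct sctp_association. */
struct assoc_model {
	int  overall_error_count; /* failed retransmissions counted so far */
	int  max_retrans;         /* retransmission limit for the association */
	bool shutdown_pending;    /* a graceful shutdown has been requested */
	bool t5_guard_running;    /* T5 shutdown guard timer is armed */
};

/* A retransmission timed out without the data being acknowledged. */
static void on_retransmit_timeout(struct assoc_model *a)
{
	a->overall_error_count++;

	/*
	 * During shutdown, reaching the limit arms the T5 guard timer to
	 * give the receiver some additional time to recover.
	 * Assumption: ">=" is used here; the changelog does not say
	 * whether the real check is ">" or ">=".
	 */
	if (a->shutdown_pending && a->overall_error_count >= a->max_retrans)
		a->t5_guard_running = true;
}

/*
 * A SACK announcing a zero window, or a HEARTBEAT-ACK, arrived.
 * The counter is only cleared while not in shutdown, so a zero-window
 * peer can no longer keep the association alive forever.
 */
static void on_zero_window_sack_or_hb_ack(struct assoc_model *a)
{
	if (!a->shutdown_pending)
		a->overall_error_count = 0;
}

/* The receiver acknowledged some data: stop the guard timer. */
static void on_data_acked(struct assoc_model *a)
{
	a->t5_guard_running = false;
}
```

With max_retrans set to 3 and shutdown_pending set, three calls to
on_retransmit_timeout() arm the guard timer even if
on_zero_window_sack_or_hb_ack() runs in between, and on_data_acked()
disarms it again.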
Neil