Message-Id: <20181103.000018.2014183024812085135.davem@davemloft.net>
Date: Sat, 03 Nov 2018 00:00:18 -0700 (PDT)
From: David Miller <davem@...emloft.net>
To: dhowells@...hat.com
Cc: netdev@...r.kernel.org, linux-afs@...ts.infradead.org,
    linux-kernel@...r.kernel.org
Subject: Re: [PATCH net] rxrpc: Fix lockup due to no error backoff after ack transmit error
From: David Howells <dhowells@...hat.com>
Date: Thu, 01 Nov 2018 13:39:53 +0000
> If the network becomes (partially) unavailable, say by disabling IPv6, the
> background ACK transmission routine can get itself into a tizzy by
> proposing immediate ACK retransmission. Since we're in the call event
> processor, that happens immediately without returning to the workqueue
> manager.
>
> The condition should clear after a while when either the network comes back
> or the call times out.
>
> Fix this by:
>
> (1) When re-proposing an ACK on failed Tx, don't schedule it immediately.
> This will allow a certain amount of time to elapse before we try
> again.
>
> (2) Enforce a return to the workqueue manager after a certain number of
> iterations of the call processing loop.
>
> (3) Add a backoff delay that increases the delay on deferred ACKs by a
> jiffy per failed transmission to a limit of HZ. The backoff delay is
> cleared on a successful return from kernel_sendmsg().
>
> (4) Cancel calls immediately if the opening sendmsg fails. The layer
> above can arrange retransmission or rotate to another server.
>
> Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
> Signed-off-by: David Howells <dhowells@...hat.com>
Applied and queued up for -stable.
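
For readers following the thread, points (2) and (3) of the quoted description can be sketched roughly as below. This is a userspace illustration only, not the code from the patch: `struct sketch_call`, `tx_backoff`, `SKETCH_HZ`, `SKETCH_MAX_LOOPS` and every function name are hypothetical, and the loop limit of 4 is an arbitrary choice made for the example. The only details taken from the description are that the delay grows by one jiffy per failed transmission up to a limit of HZ, that it is cleared when the send succeeds, and that the processing loop must hand control back after a bounded number of iterations.

```c
/* Assumed tick rate for this sketch; the kernel's HZ is a build-time setting. */
#define SKETCH_HZ 250UL

/* Arbitrary iteration cap chosen for illustration only. */
#define SKETCH_MAX_LOOPS 4

/* Hypothetical per-call state; this is not the real struct rxrpc_call. */
struct sketch_call {
	unsigned long tx_backoff;	/* extra jiffies added to deferred ACKs */
};

/*
 * Point (3): update the backoff after a transmit attempt.  'ret' follows
 * the kernel_sendmsg() convention of a negative value on error.
 */
void sketch_note_tx_result(struct sketch_call *call, int ret)
{
	if (ret < 0) {
		/* One more jiffy per failure, capped at one second (HZ). */
		if (call->tx_backoff < SKETCH_HZ)
			call->tx_backoff++;
	} else {
		/* A successful send clears the backoff. */
		call->tx_backoff = 0;
	}
}

/* Delay to use when re-proposing a deferred ACK. */
unsigned long sketch_ack_delay(const struct sketch_call *call,
			       unsigned long base_delay)
{
	return base_delay + call->tx_backoff;
}

/*
 * Point (2): process pending events, but give up after a bounded number
 * of iterations.  Returns 1 if work remains and the caller should requeue
 * itself, 0 if everything was handled.
 */
int sketch_process_events(int *pending)
{
	int loops = 0;

	while (*pending > 0) {
		if (++loops > SKETCH_MAX_LOOPS)
			return 1;	/* hand control back to the scheduler */
		(*pending)--;		/* stand-in for handling one event */
	}
	return 0;
}
```

With these definitions, repeated failures grow the extra delay one jiffy at a time until it reaches `SKETCH_HZ`, and a single successful send resets it to zero, so a dead network costs at most about one retry per second per call instead of a busy loop.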