Message-ID: <425d6e31-cee2-26f1-c2dd-5988eaf758f7@oracle.com>
Date: Thu, 30 Nov 2017 12:28:45 -0800
From: Santosh Shilimkar <santosh.shilimkar@...cle.com>
To: Sowmini Varadhan <sowmini.varadhan@...cle.com>,
netdev@...r.kernel.org
Cc: davem@...emloft.net, rds-devel@....oracle.com
Subject: Re: [PATCH net-next 2/3] rds: tcp: correctly sequence cleanup on
netns deletion.
On 11/30/2017 11:11 AM, Sowmini Varadhan wrote:
> Commit 8edc3affc077 ("rds: tcp: Take explicit refcounts on struct net")
> introduces a regression in rds-tcp netns cleanup. The cleanup_net(),
> (and thus rds_tcp_dev_event notification) is only called from put_net()
> when all netns refcounts go to 0, but this cannot happen if the
> rds_connection itself is holding a c_net ref that it expects to
> release in rds_tcp_kill_sock.
>
> Instead, the rds_tcp_kill_sock callback should make sure to
> tear down state carefully, ensuring that the socket teardown
> is only done after all data-structures and workqs that depend
> on it are quiesced.
>
> The original motivation for commit 8edc3affc077 ("rds: tcp: Take explicit
> refcounts on struct net") was to resolve a race condition reported by
> syzkaller where workqs for tx/rx/connect were triggered after the
> namespace was deleted. Those worker threads should have been
> cancelled/flushed before socket tear-down and indeed,
> rds_conn_path_destroy() does try to sequence this by doing
> /* cancel cp_send_w */
> /* cancel cp_recv_w */
> /* flush cp_down_w */
> /* free data structures */
> Here the "flush cp_down_w" will trigger rds_conn_shutdown and thus
> invoke rds_tcp_conn_path_shutdown() to close the tcp socket, so that
> we ought to have satisfied the requirement that "socket-close is
> done after all other dependent state is quiesced". However,
> rds_conn_shutdown has a bug in that it *always* triggers the reconnect
> workq (and if connection is successful, we always restart tx/rx
> workqs so with the right timing, we risk the race conditions reported
> by syzkaller).
>
> Netns deletion is like module teardown- no need to restart a
> reconnect in this case. We can use the c_destroy_in_prog bit
> to avoid restarting the reconnect.
>
> Fixes: 8edc3affc077 ("rds: tcp: Take explicit refcounts on struct net")
> Signed-off-by: Sowmini Varadhan <sowmini.varadhan@...cle.com>
> ---
> net/rds/connection.c | 3 ++-
> net/rds/rds.h | 6 +++---
> net/rds/tcp.c | 4 ++--
> 3 files changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/net/rds/connection.c b/net/rds/connection.c
> index 7ee2d5d..9efc82c 100644
> --- a/net/rds/connection.c
> +++ b/net/rds/connection.c
> @@ -366,6 +366,8 @@ void rds_conn_shutdown(struct rds_conn_path *cp)
> * to the conn hash, so we never trigger a reconnect on this
> * conn - the reconnect is always triggered by the active peer. */
> cancel_delayed_work_sync(&cp->cp_conn_w);
> + if (conn->c_destroy_in_prog)
> + return;
Not related to this patch, but it would be safer to use
cp_flags or, if needed, add a flag at the conn level for the bundle,
and use bitwise operations to avoid possible races when setting
c_destroy_in_prog. Something similar to RDS_DESTROY_PENDING etc.
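To illustrate the suggestion, here is a minimal userspace sketch using C11
atomics. It is not kernel code and not the change proposed in the patch:
every sketch_* name, the struct layout, and the bit value are hypothetical,
and atomic_fetch_or() merely stands in for the kernel's atomic bit helpers
such as test_and_set_bit() and test_bit(). It shows a destroy flag kept as
one bit in a flags word, set atomically so that only one caller wins, and
checked before a reconnect is requeued.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit position; the name echoes RDS_DESTROY_PENDING from the
 * review comment, but the value here is an assumption. */
#define SKETCH_DESTROY_PENDING_MASK (1UL << 0)

/* Hypothetical stand-in for a connection path with a flags word. */
struct sketch_conn_path {
	atomic_ulong cp_flags;   /* bit flags, updated atomically */
	int reconnects_queued;   /* counts simulated reconnect requests */
};

void sketch_path_init(struct sketch_conn_path *cp)
{
	atomic_init(&cp->cp_flags, 0UL);
	cp->reconnects_queued = 0;
}

/* Atomically set the destroy bit. Returns true only for the caller that
 * actually flipped it, so concurrent destroyers cannot both proceed. */
bool sketch_mark_destroy(struct sketch_conn_path *cp)
{
	unsigned long old = atomic_fetch_or(&cp->cp_flags,
					    SKETCH_DESTROY_PENDING_MASK);
	return (old & SKETCH_DESTROY_PENDING_MASK) == 0;
}

bool sketch_destroy_pending(struct sketch_conn_path *cp)
{
	return (atomic_load(&cp->cp_flags) & SKETCH_DESTROY_PENDING_MASK) != 0;
}

/* Tail of a simulated shutdown: skip the reconnect once teardown began. */
void sketch_conn_shutdown(struct sketch_conn_path *cp)
{
	if (sketch_destroy_pending(cp))
		return;
	cp->reconnects_queued++;
}

/* Usage example: a reconnect is queued before teardown, none after. */
void sketch_selfcheck(void)
{
	struct sketch_conn_path cp;

	sketch_path_init(&cp);
	sketch_conn_shutdown(&cp);
	assert(cp.reconnects_queued == 1);
	assert(sketch_mark_destroy(&cp));   /* first caller wins */
	assert(!sketch_mark_destroy(&cp));  /* later callers see it set */
	sketch_conn_shutdown(&cp);
	assert(cp.reconnects_queued == 1);  /* no reconnect after destroy */
}
```

The point of the atomic read-modify-write is that setting and testing the
flag cannot interleave with another writer, which a plain field assignment
does not guarantee.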
The patch itself looks good to me in terms of netns ref counting.
Acked-by: Santosh Shilimkar <santosh.shilimkar@...cle.com>