[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <5f3feecc-65ad-af5f-0ecd-94b2605ab67e@I-love.SAKURA.ne.jp>
Date: Thu, 5 May 2022 00:15:46 +0900
From: Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
To: Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>
Cc: patchwork-bot+netdevbpf@...nel.org,
Santosh Shilimkar <santosh.shilimkar@...cle.com>,
David Miller <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
syzbot <syzbot+694120e1002c117747ed@...kaller.appspotmail.com>,
netdev <netdev@...r.kernel.org>,
syzkaller-bugs <syzkaller-bugs@...glegroups.com>,
linux-rdma <linux-rdma@...r.kernel.org>
Subject: Re: [PATCH v2] net: rds: acquire refcount on TCP sockets
On 2022/05/04 13:58, Tetsuo Handa wrote:
> On 2022/05/04 12:09, Eric Dumazet wrote:
>> This exit() handler _has_ to remove all known listeners, and
>> definitely cancel work queues (synchronous operation)
>> before the actual "struct net" free can happen later.
>
> But in your report, rds_tcp_tune() is called from rds_tcp_conn_path_connect() from
> rds_connect_worker() via "struct rds_connection"->cp_conn_w work. I can see that
> rds_tcp_kill_sock() calls rds_tcp_listen_stop(lsock, &rtn->rds_tcp_accept_w), and
> rds_tcp_listen_stop() calls flush_workqueue(rds_wq) and flush_work(&rtn->rds_tcp_accept_w).
>
> But I can't see how rds_tcp_exit_net() synchronously cancels all works associated
> with "struct rds_conn_path".
>
> struct rds_conn_path {
> struct delayed_work cp_send_w;
> struct delayed_work cp_recv_w;
> struct delayed_work cp_conn_w;
> struct work_struct cp_down_w;
> }
>
> These works are queued to rds_wq, but flush_workqueue() waits for completion only
> if already queued. What if timer for queue_delayed_work() has not expired, or was
> about to call queue_delayed_work() ? Is flush_workqueue(rds_wq) sufficient?
rds_tcp_tune+0x5a0/0x5f0 net/rds/tcp.c:503
rds_tcp_conn_path_connect+0x489/0x880 net/rds/tcp_connect.c:127
rds_connect_worker+0x1a5/0x2c0 net/rds/threads.c:176
process_one_work+0x996/0x1610 kernel/workqueue.c:2289
rds_tcp_conn_path_connect is referenced by
"struct rds_transport rds_tcp_transport"->conn_path_connect.
It is invoked by
ret = conn->c_trans->conn_path_connect(cp)
in rds_connect_worker().
rds_connect_worker is referenced by "struct rds_conn_path"->cp_conn_w
via INIT_DELAYED_WORK().
queue_delayed_work(rds_wq, &cp->cp_conn_w, *) is called by
rds_queue_reconnect() or rds_conn_path_connect_if_down().
If rds_conn_path_connect_if_down() were called from
rds_tcp_accept_one_path() from rds_tcp_accept_one(),
rds_tcp_tune() from rds_tcp_accept_one() was already called
before rds_tcp_tune() from rds_tcp_conn_path_connect() is called.
Since the addition on 0 was not reported at rds_tcp_tune() from
rds_tcp_accept_one(), what Eric is reporting cannot be from
rds_tcp_accept_one() from rds_tcp_accept_worker().
Despite rds_tcp_kill_sock() sets rtn->rds_tcp_listen_sock = NULL and
waits for rds_tcp_accept_one() from rds_tcp_accept_worker() to complete
using flush_workqueue(rds_wq), what Eric is reporting is different from
what syzbot+694120e1002c117747ed was reporting.
>
> Anyway, if rds_tcp_kill_sock() can somehow guarantee that all works are completed
> or cancelled, the fix would look like something below?
I think it is OK to apply below diff in order to avoid addition on 0 problem, but
it is not proven that kmem_cache_free() is not yet called. What should we do?
>
> net/rds/tcp.c | 11 ++++++++---
> net/rds/tcp.h | 2 +-
> net/rds/tcp_connect.c | 5 ++++-
> net/rds/tcp_listen.c | 5 ++++-
> 4 files changed, 17 insertions(+), 6 deletions(-)
>
Powered by blists - more mailing lists