[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211012052019.184398-1-liujian56@huawei.com>
Date: Tue, 12 Oct 2021 13:20:19 +0800
From: Liu Jian <liujian56@...wei.com>
To: <john.fastabend@...il.com>, <daniel@...earbox.net>,
<jakub@...udflare.com>, <lmb@...udflare.com>,
<edumazet@...gle.com>, <davem@...emloft.net>,
<yoshfuji@...ux-ipv6.org>, <dsahern@...nel.org>, <kuba@...nel.org>,
<ast@...nel.org>, <andrii@...nel.org>, <kafai@...com>,
<songliubraving@...com>, <yhs@...com>, <kpsingh@...nel.org>,
<netdev@...r.kernel.org>, <bpf@...r.kernel.org>
CC: <liujian56@...wei.com>
Subject: [PATHC bpf v2] tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
With two Msgs, msgA and msgB and a user doing nonblocking sendmsg calls (or
multiple cores) on a single socket 'sk' we could get the following flow.
msgA, sk msgB, sk
----------- ---------------
tcp_bpf_sendmsg()
lock(sk)
psock = sk->psock
tcp_bpf_sendmsg()
lock(sk) ... blocking
tcp_bpf_send_verdict
if (psock->eval == NONE)
psock->eval = sk_psock_msg_verdict
..
< handle SK_REDIRECT case >
release_sock(sk) < lock dropped so grab here >
ret = tcp_bpf_sendmsg_redir
psock = sk->psock
tcp_bpf_send_verdict
lock_sock(sk) ... blocking on B
if (psock->eval == NONE) <- boom.
psock->eval will have msgA state
The problem here is we dropped the lock on msgA and grabbed it with msgB.
Now we have old state in psock and importantly psock->eval has not been
cleared. So msgB will run whatever action was done on A and the verdict
program may never see it.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Liu Jian <liujian56@...wei.com>
---
v2: change commit message, and add the fixes tag
net/ipv4/tcp_bpf.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index d3e9386b493e..9d068153c316 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -232,6 +232,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
bool cork = false, enospc = sk_msg_full(msg);
struct sock *sk_redir;
u32 tosend, delta = 0;
+ u32 eval = __SK_NONE;
int ret;
more_data:
@@ -275,13 +276,24 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
case __SK_REDIRECT:
sk_redir = psock->sk_redir;
sk_msg_apply_bytes(psock, tosend);
+ if (!psock->apply_bytes) {
+ /* Clean up before releasing the sock lock. */
+ eval = psock->eval;
+ psock->eval = __SK_NONE;
+ psock->sk_redir = NULL;
+ }
if (psock->cork) {
cork = true;
psock->cork = NULL;
}
sk_msg_return(sk, msg, tosend);
release_sock(sk);
+
ret = tcp_bpf_sendmsg_redir(sk_redir, msg, tosend, flags);
+
+ if (eval == __SK_REDIRECT)
+ sock_put(sk_redir);
+
lock_sock(sk);
if (unlikely(ret < 0)) {
int free = sk_msg_free_nocharge(sk, msg);
--
2.17.1
Powered by blists - more mailing lists