Message-Id: <20201123121824.467166287@linuxfoundation.org>
Date: Mon, 23 Nov 2020 13:22:02 +0100
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-kernel@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
stable@...r.kernel.org, Jakub Sitnicki <jakub@...udflare.com>,
John Fastabend <john.fastabend@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
Sasha Levin <sashal@...nel.org>
Subject: [PATCH 5.4 094/158] bpf, sockmap: Ensure SO_RCVBUF memory is observed on ingress redirect
From: John Fastabend <john.fastabend@...il.com>
[ Upstream commit 36cd0e696a832a00247fca522034703566ac8885 ]
Fix sockmap sk_skb programs so that they observe sk_rcvbuf limits. This
allows users to tune SO_RCVBUF, and sockmap will honor it.

We can refactor the if (charge) case out in later patches, but keep this
fix to the point.
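
A minimal user-space sketch of the tuning this enables (illustrative
only, not part of the patch; cap_rcvbuf() is a hypothetical helper name,
and the fd is assumed to be a socket that is later inserted into a
sockmap):

#include <stdio.h>
#include <sys/socket.h>

/* Cap the ingress socket's receive buffer so the sk_rcvbuf check in
 * sk_psock_skb_ingress() can push back on redirected data.
 */
static int cap_rcvbuf(int fd, int bytes)
{
	int val = bytes;
	socklen_t len = sizeof(val);

	if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &val, sizeof(val)))
		return -1;

	/* The kernel stores twice the requested value to cover its
	 * bookkeeping overhead; read back the effective limit.
	 */
	if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &val, &len))
		return -1;

	printf("effective SO_RCVBUF: %d\n", val);
	return 0;
}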
Fixes: 51199405f9672 ("bpf: skb_verdict, support SK_PASS on RX BPF path")
Suggested-by: Jakub Sitnicki <jakub@...udflare.com>
Signed-off-by: John Fastabend <john.fastabend@...il.com>
Signed-off-by: Daniel Borkmann <daniel@...earbox.net>
Reviewed-by: Jakub Sitnicki <jakub@...udflare.com>
Link: https://lore.kernel.org/bpf/160556568657.73229.8404601585878439060.stgit@john-XPS-13-9370
Signed-off-by: Sasha Levin <sashal@...nel.org>
---
 net/core/skmsg.c   | 20 ++++++++++++++++----
 net/ipv4/tcp_bpf.c |  3 ++-
 2 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 118cf1ace43a6..1f8e3445cd2f0 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -170,10 +170,12 @@ static int sk_msg_free_elem(struct sock *sk, struct sk_msg *msg, u32 i,
 	struct scatterlist *sge = sk_msg_elem(msg, i);
 	u32 len = sge->length;
 
-	if (charge)
-		sk_mem_uncharge(sk, len);
-	if (!msg->skb)
+	/* When the skb owns the memory we free it from consume_skb path. */
+	if (!msg->skb) {
+		if (charge)
+			sk_mem_uncharge(sk, len);
 		put_page(sg_page(sge));
+	}
 	memset(sge, 0, sizeof(*sge));
 	return len;
 }
@@ -403,6 +405,9 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb)
 	int copied = 0, num_sge;
 	struct sk_msg *msg;
 
+	if (atomic_read(&sk->sk_rmem_alloc) > sk->sk_rcvbuf)
+		return -EAGAIN;
+
 	msg = kzalloc(sizeof(*msg), __GFP_NOWARN | GFP_ATOMIC);
 	if (unlikely(!msg))
 		return -EAGAIN;
@@ -418,7 +423,14 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb)
 		return num_sge;
 	}
 
-	sk_mem_charge(sk, skb->len);
+	/* This will transition ownership of the data from the socket where
+	 * the BPF program was run initiating the redirect to the socket
+	 * we will eventually receive this data on. The data will be released
+	 * from consume_skb() found in __tcp_bpf_recvmsg() after it's been
+	 * copied into user buffers.
+	 */
+	skb_set_owner_r(skb, sk);
+
 	copied = skb->len;
 	msg->sg.start = 0;
 	msg->sg.size = copied;
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index efd098b00104b..819255ee4e42d 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -77,7 +77,8 @@ int __tcp_bpf_recvmsg(struct sock *sk, struct sk_psock *psock,
 			if (likely(!peek)) {
 				sge->offset += copy;
 				sge->length -= copy;
-				sk_mem_uncharge(sk, copy);
+				if (!msg_rx->skb)
+					sk_mem_uncharge(sk, copy);
 				msg_rx->sg.size -= copy;
 
 				if (!sge->length) {
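
Taken together, the hunks above move the receive-side charge to
skb_set_owner_r() at redirect time and drop it from the consume_skb()
path once __tcp_bpf_recvmsg() has copied the data out, so
sk_rmem_alloc now tracks redirected data against sk_rcvbuf. The kind of
back-pressure this restores can be observed with plain TCP as well; a
rough stand-alone sketch (illustrative only, not sockmap; exact byte
counts depend on buffer autotuning):

#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	struct sockaddr_in a = { .sin_family = AF_INET };
	socklen_t alen = sizeof(a);
	int ls, cs, ps, rcvbuf = 4096;
	char buf[4096];
	long total = 0;
	ssize_t n;

	a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	ls = socket(AF_INET, SOCK_STREAM, 0);
	if (ls < 0 || bind(ls, (struct sockaddr *)&a, sizeof(a)) ||
	    getsockname(ls, (struct sockaddr *)&a, &alen) || listen(ls, 1))
		return 1;

	cs = socket(AF_INET, SOCK_STREAM, 0);
	if (cs < 0 || connect(cs, (struct sockaddr *)&a, sizeof(a)))
		return 1;
	ps = accept(ls, NULL, NULL);
	if (ps < 0)
		return 1;

	/* Shrink the receiver's buffer and never read from it, then
	 * write nonblocking until the kernel pushes back with EAGAIN.
	 */
	setsockopt(ps, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf));
	fcntl(cs, F_SETFL, O_NONBLOCK);

	memset(buf, 'x', sizeof(buf));
	while ((n = send(cs, buf, sizeof(buf), 0)) > 0)
		total += n;

	printf("queued %ld bytes before back-pressure\n", total);
	close(cs);
	close(ps);
	close(ls);
	return 0;
}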
--
2.27.0