Message-Id: <20190726152302.754694627@linuxfoundation.org>
Date: Fri, 26 Jul 2019 17:24:56 +0200
From: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
To: linux-kernel@...r.kernel.org
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
stable@...r.kernel.org, Eric Dumazet <edumazet@...gle.com>,
Lawrence Brakmo <brakmo@...com>,
Neal Cardwell <ncardwell@...gle.com>,
"David S. Miller" <davem@...emloft.net>
Subject: [PATCH 4.19 21/50] tcp: fix tcp_set_congestion_control() use from bpf hook
From: Eric Dumazet <edumazet@...gle.com>
[ Upstream commit 8d650cdedaabb33e85e9b7c517c0c71fcecc1de9 ]
Neal reported incorrect use of ns_capable() from bpf hook.

bpf_setsockopt(...TCP_CONGESTION...)
  -> tcp_set_congestion_control()
   -> ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)
    -> ns_capable_common()
     -> current_cred()
      -> rcu_dereference_protected(current->cred, 1)

Accessing 'current' in bpf context makes no sense, since packets
are processed from softirq context.
As Neal stated: The capability check in tcp_set_congestion_control()
was written assuming a system call context, and then was reused from
a BPF call site.
The fix is to add a new parameter to tcp_set_congestion_control(),
so that the ns_capable() call is only performed under the right
context.
Fixes: 91b5b21c7c16 ("bpf: Add support for changing congestion control")
Signed-off-by: Eric Dumazet <edumazet@...gle.com>
Cc: Lawrence Brakmo <brakmo@...com>
Reported-by: Neal Cardwell <ncardwell@...gle.com>
Acked-by: Neal Cardwell <ncardwell@...gle.com>
Acked-by: Lawrence Brakmo <brakmo@...com>
Signed-off-by: David S. Miller <davem@...emloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@...uxfoundation.org>
---
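The pattern described in the changelog can be sketched in plain userspace C.
This is only an illustration under stated assumptions, not kernel code: every
demo_* name, the algorithm table, and the "restricted_ca" entry are
hypothetical stand-ins. The point it shows is that the callee no longer looks
at the calling task; each caller decides what to pass as cap_net_admin.

```c
#include <assert.h>   /* only needed by the usage example below */
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct tcp_congestion_ops (not the kernel type). */
struct demo_ca {
	const char *name;
	bool non_restricted;	/* models the TCP_CONG_NON_RESTRICTED flag */
};

/* Hypothetical table of registered algorithms. */
static const struct demo_ca demo_cas[] = {
	{ "cubic", true },
	{ "restricted_ca", false },
};

/*
 * Models tcp_set_congestion_control() after the patch: the privilege
 * decision arrives as a parameter and is never derived from 'current'.
 */
static int demo_set_ca(const char *name, bool cap_net_admin)
{
	size_t i;

	for (i = 0; i < sizeof(demo_cas) / sizeof(demo_cas[0]); i++) {
		if (strcmp(demo_cas[i].name, name) != 0)
			continue;
		if (!(demo_cas[i].non_restricted || cap_net_admin))
			return -EPERM;
		return 0;
	}
	return -ENOENT;
}

/*
 * Models the setsockopt() path: a task context exists, so the caller can
 * evaluate the capability itself (here passed in as a plain bool).
 */
static int demo_from_syscall(const char *name, bool caller_has_cap)
{
	return demo_set_ca(name, caller_has_cap);
}

/* Models the bpf hook path: no meaningful task context, so it passes true. */
static int demo_from_bpf(const char *name)
{
	return demo_set_ca(name, true);
}
```

For example, demo_from_syscall("restricted_ca", false) returns -EPERM, while
demo_from_bpf("restricted_ca") returns 0, mirroring the "reinit, true" call
site in net/core/filter.c.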
include/net/tcp.h | 3 ++-
net/core/filter.c | 2 +-
net/ipv4/tcp.c | 4 +++-
net/ipv4/tcp_cong.c | 6 +++---
4 files changed, 9 insertions(+), 6 deletions(-)
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1054,7 +1054,8 @@ void tcp_get_default_congestion_control(
void tcp_get_available_congestion_control(char *buf, size_t len);
void tcp_get_allowed_congestion_control(char *buf, size_t len);
int tcp_set_allowed_congestion_control(char *allowed);
-int tcp_set_congestion_control(struct sock *sk, const char *name, bool load, bool reinit);
+int tcp_set_congestion_control(struct sock *sk, const char *name, bool load,
+ bool reinit, bool cap_net_admin);
u32 tcp_slow_start(struct tcp_sock *tp, u32 acked);
void tcp_cong_avoid_ai(struct tcp_sock *tp, u32 w, u32 acked);
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3991,7 +3991,7 @@ BPF_CALL_5(bpf_setsockopt, struct bpf_so
TCP_CA_NAME_MAX-1));
name[TCP_CA_NAME_MAX-1] = 0;
ret = tcp_set_congestion_control(sk, name, false,
- reinit);
+ reinit, true);
} else {
struct tcp_sock *tp = tcp_sk(sk);
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2729,7 +2729,9 @@ static int do_tcp_setsockopt(struct sock
name[val] = 0;
lock_sock(sk);
- err = tcp_set_congestion_control(sk, name, true, true);
+ err = tcp_set_congestion_control(sk, name, true, true,
+ ns_capable(sock_net(sk)->user_ns,
+ CAP_NET_ADMIN));
release_sock(sk);
return err;
}
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -332,7 +332,8 @@ out:
* tcp_reinit_congestion_control (if the current congestion control was
* already initialized.
*/
-int tcp_set_congestion_control(struct sock *sk, const char *name, bool load, bool reinit)
+int tcp_set_congestion_control(struct sock *sk, const char *name, bool load,
+ bool reinit, bool cap_net_admin)
{
struct inet_connection_sock *icsk = inet_csk(sk);
const struct tcp_congestion_ops *ca;
@@ -368,8 +369,7 @@ int tcp_set_congestion_control(struct so
} else {
err = -EBUSY;
}
- } else if (!((ca->flags & TCP_CONG_NON_RESTRICTED) ||
- ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))) {
+ } else if (!((ca->flags & TCP_CONG_NON_RESTRICTED) || cap_net_admin)) {
err = -EPERM;
} else if (!try_module_get(ca->owner)) {
err = -EBUSY;