[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190118223926.GA15896@rdna-mbp.dhcp.thefacebook.com>
Date: Fri, 18 Jan 2019 22:39:30 +0000
From: Andrey Ignatov <rdna@...com>
To: Stanislav Fomichev <sdf@...ichev.me>
CC: Stanislav Fomichev <sdf@...gle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"ast@...nel.org" <ast@...nel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>,
"edumazet@...gle.com" <edumazet@...gle.com>
Subject: Re: [PATCH bpf-next 0/5] add bpf cgroup hooks that trigger on socket
close
Stanislav Fomichev <sdf@...ichev.me> [Fri, 2019-01-18 08:50 -0800]:
> On 01/18, Andrey Ignatov wrote:
> > Stanislav Fomichev <sdf@...gle.com> [Thu, 2019-01-17 16:41 -0800]:
> > > Currently, we have BPF_CGROUP_INET_SOCK_CREATE hook that triggers on
> > > socket creation and there is no way to know when the socket is being
> > > closed. Add new set of hooks BPF_CGROUP_INET{4,6}_SOCK_RELEASE
> > > that trigger when the socket is closed.
> > >
> > > Initial intended usecase is to cleanup statistics after POST{4,6}_BIND.
> > > Hooks have read-only access to all fields of struct bpf_sock.
> >
> > Do you need it for both TCP and UDP?
> Yes, we need both TCP and UDP. Although, UDP is tricky in general with
> the connected/unconnected cases.
>
> > I was thinking about this hook earlier but since in my case only TCP was
> > needed I ended up using TCP-BPF. E.g. be BPF_SOCK_OPS_TCP_LISTEN_CB or
> > BPF_SOCK_OPS_TCP_CONNECT_CB can be used instead of POST{4,6}_BIND to
> > enable something, and then BPF_SOCK_OPS_STATE_CB can be used instead of
> > SOCK_RELEASE to disable that something when socket transisions to
> > BPF_TCP_CLOSE (e.g. BPF_TCP_LISTEN -> BPF_TCP_CLOSE).
> >
> > That turned out to be much cleaner than POST{4,6}_BIND and also works
> > fine when socket is disconnected with AF_UNSPEC and then connected again
> > (what Eric mentioned).
>
> What if we do something like the patch below? Add pre_release hook (like we
> currently have for pre_connect) and call it from connect(AF_UNSPEC) and
> from inet_release? Any concerns here?
That's hard to say w/o fully understanding the use-case. Could you
provide more details on it please?
Is my understanding correct that some statistics is needed only for
_client_ sockets (since we're talking about connect) and these client
sockets are always bound to local IP:port (or maybe IP only with
IP_BIND_ADDRESS_NO_PORT?) before sending data and that's why
POST{4,6}_BIND is used to start gathering statistics?
And then later, there should be some event to cleanup statistics and you
chose ops->release for this? If so what's kind of statistics is needed?
Is it per-destination statistics (since AF_UNSPEC matters?)? If so then
it should be cleaned up whenever destination is changed or when socket
is disconnected the last time.
For UDP socket connect(2) can be called many times with different
destinations even w/o AF_UNSPEC in between (in contrast to TCP). So your
patch with pre_release below won't handle it. Furthermore even if
connect(2) was used to set destination, sendmsg(2) can override it for
individual datagram.
If per-destination statistics is needed for UDP client socket then IMO
both "start events" should be handled: BPF_CGROUP_INET{4,6}_CONNECT
(connected UDP) and BPF_CGROUP_UDP{4,6}_SENDMSG (unconnected UDP). The
former can be used to "close" previous statistics "window" (if it's not
the first connect) as well. The latter can be used to created "separate"
statistics "window" for destination of current datagram only. And the
only case left, from what I see, is when socket is stopped being used
completely. That's, again, hard to say how to handle it better w/o
understanding use-case (e.g. since it's only UDP problem it may make
sense to implement in UDP stack).
> (I agree that TCP is probably better handled via BPF_SOCK_OPS_TCP_XYZ hooks,
> but we need something for UDP as well)
Do you think you may handle TCP with TCP-BPF and this patch set may
focus on UDP only if there is no good attach point that covers both TCP
and UDP? That may lead to different BPF programs for TCP and UDP but
that may be hard to avoid anyway (BPF_CGROUP_UDP{4,6}_SENDMSG is a good
example here).
>
> --
>
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index b703ad242365..ee3dc181df8f 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -568,8 +568,11 @@ int inet_dgram_connect(struct socket *sock, struct sockaddr *uaddr,
>
> if (addr_len < sizeof(uaddr->sa_family))
> return -EINVAL;
> - if (uaddr->sa_family == AF_UNSPEC)
> + if (uaddr->sa_family == AF_UNSPEC) {
> + if (BPF_CGROUP_PRE_RELEASE_ENABLED(sk))
> + sk->sk_prot->pre_release(sk);
> return sk->sk_prot->disconnect(sk, flags);
> + }
>
> if (BPF_CGROUP_PRE_CONNECT_ENABLED(sk)) {
> err = sk->sk_prot->pre_connect(sk, uaddr, addr_len);
> @@ -632,6 +635,8 @@ int __inet_stream_connect(struct socket *sock, struct sockaddr *uaddr,
> return -EINVAL;
>
> if (uaddr->sa_family == AF_UNSPEC) {
> + if (BPF_CGROUP_PRE_RELEASE_ENABLED(sk))
> + sk->sk_prot->pre_release(sk);
> err = sk->sk_prot->disconnect(sk, flags);
> sock->state = err ? SS_DISCONNECTING : SS_UNCONNECTED;
> goto out;
>
> > > First patch adds hooks, the rest of the patches add uapi and tests to make
> > > sure these hooks work.
> > >
> > > Stanislav Fomichev (5):
> > > bpf: add BPF_CGROUP_INET{4,6}_SOCK_RELEASE hooks
> > > tools: bpf: support BPF_CGROUP_INET{4,6}_SOCK_RELEASE in
> > > libbpf/bpftool
> > > selftests/bpf: add BPF_CGROUP_INET{4,6}_SOCK_RELEASE to
> > > test_section_names.c
> > > selftests/bpf: add BPF_CGROUP_INET{4,6}_SOCK_RELEASE to test_sock.c
> > > selftests/bpf: add BPF_CGROUP_INET{4,6}_SOCK_RELEASE to
> > > test_sock_addr.c
> > >
> > > include/linux/bpf-cgroup.h | 6 +
> > > include/net/inet_common.h | 1 +
> > > include/uapi/linux/bpf.h | 2 +
> > > kernel/bpf/syscall.c | 8 ++
> > > net/core/filter.c | 7 +
> > > net/ipv4/af_inet.c | 13 +-
> > > net/ipv6/af_inet6.c | 5 +-
> > > tools/bpf/bpftool/cgroup.c | 2 +
> > > tools/include/uapi/linux/bpf.h | 2 +
> > > tools/lib/bpf/libbpf.c | 4 +
> > > .../selftests/bpf/test_section_names.c | 10 ++
> > > tools/testing/selftests/bpf/test_sock.c | 119 ++++++++++++++++
> > > tools/testing/selftests/bpf/test_sock_addr.c | 131 +++++++++++++++++-
> > > 13 files changed, 307 insertions(+), 3 deletions(-)
> > >
> > > --
> > > 2.20.1.321.g9e740568ce-goog
> > >
> >
> > --
> > Andrey Ignatov
--
Andrey Ignatov
Powered by blists - more mailing lists