[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20190122170035.GB26773@mini-arch>
Date: Tue, 22 Jan 2019 09:00:35 -0800
From: Stanislav Fomichev <sdf@...ichev.me>
To: Andrey Ignatov <rdna@...com>
Cc: Stanislav Fomichev <sdf@...gle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"ast@...nel.org" <ast@...nel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>,
"edumazet@...gle.com" <edumazet@...gle.com>
Subject: Re: [PATCH bpf-next 0/5] add bpf cgroup hooks that trigger on socket
close
On 01/18, Andrey Ignatov wrote:
> Stanislav Fomichev <sdf@...ichev.me> [Fri, 2019-01-18 08:50 -0800]:
> > On 01/18, Andrey Ignatov wrote:
> > > Stanislav Fomichev <sdf@...gle.com> [Thu, 2019-01-17 16:41 -0800]:
> > > > Currently, we have BPF_CGROUP_INET_SOCK_CREATE hook that triggers on
> > > > socket creation and there is no way to know when the socket is being
> > > > closed. Add new set of hooks BPF_CGROUP_INET{4,6}_SOCK_RELEASE
> > > > that trigger when the socket is closed.
> > > >
> > > > Initial intended usecase is to cleanup statistics after POST{4,6}_BIND.
> > > > Hooks have read-only access to all fields of struct bpf_sock.
> > >
> > > Do you need it for both TCP and UDP?
> > Yes, we need both TCP and UDP. Although, UDP is tricky in general with
> > the connected/unconnected cases.
> >
> > > I was thinking about this hook earlier but since in my case only TCP was
> > > needed I ended up using TCP-BPF. E.g. be BPF_SOCK_OPS_TCP_LISTEN_CB or
> > > BPF_SOCK_OPS_TCP_CONNECT_CB can be used instead of POST{4,6}_BIND to
> > > enable something, and then BPF_SOCK_OPS_STATE_CB can be used instead of
> > > SOCK_RELEASE to disable that something when socket transisions to
> > > BPF_TCP_CLOSE (e.g. BPF_TCP_LISTEN -> BPF_TCP_CLOSE).
> > >
> > > That turned out to be much cleaner than POST{4,6}_BIND and also works
> > > fine when socket is disconnected with AF_UNSPEC and then connected again
> > > (what Eric mentioned).
> >
> > What if we do something like the patch below? Add pre_release hook (like we
> > currently have for pre_connect) and call it from connect(AF_UNSPEC) and
> > from inet_release? Any concerns here?
>
> That's hard to say w/o fully understanding the use-case. Could you
> provide more details on it please?
The use-case is a simple lightweight per-cgroup bpf audit (without involving
audit subsystem).
Basically record the following events: when the socket is created, when (and
where) it's connected/bound and when it's torn down (plus, _maybe_ rx/tx
stats).
> Is my understanding correct that some statistics is needed only for
> _client_ sockets (since we're talking about connect) and these client
> sockets are always bound to local IP:port (or maybe IP only with
> IP_BIND_ADDRESS_NO_PORT?) before sending data and that's why
> POST{4,6}_BIND is used to start gathering statistics?
>
> And then later, there should be some event to cleanup statistics and you
> chose ops->release for this? If so what's kind of statistics is needed?
> Is it per-destination statistics (since AF_UNSPEC matters?)? If so then
> it should be cleaned up whenever destination is changed or when socket
> is disconnected the last time.
>
> For UDP socket connect(2) can be called many times with different
> destinations even w/o AF_UNSPEC in between (in contrast to TCP). So your
> patch with pre_release below won't handle it. Furthermore even if
> connect(2) was used to set destination, sendmsg(2) can override it for
> individual datagram.
>
> If per-destination statistics is needed for UDP client socket then IMO
> both "start events" should be handled: BPF_CGROUP_INET{4,6}_CONNECT
> (connected UDP) and BPF_CGROUP_UDP{4,6}_SENDMSG (unconnected UDP). The
> former can be used to "close" previous statistics "window" (if it's not
> the first connect) as well. The latter can be used to created "separate"
> statistics "window" for destination of current datagram only. And the
> only case left, from what I see, is when socket is stopped being used
> completely. That's, again, hard to say how to handle it better w/o
> understanding use-case (e.g. since it's only UDP problem it may make
> sense to implement in UDP stack).
>
> > (I agree that TCP is probably better handled via BPF_SOCK_OPS_TCP_XYZ hooks,
> > but we need something for UDP as well)
>
> Do you think you may handle TCP with TCP-BPF and this patch set may
> focus on UDP only if there is no good attach point that covers both TCP
> and UDP? That may lead to different BPF programs for TCP and UDP but
> that may be hard to avoid anyway (BPF_CGROUP_UDP{4,6}_SENDMSG is a good
> example here).
Let me look closer into TCP-BPF hooks, it looks like that would be
enough for TCP at least. I'll get back to you in case there is an issue.
I'll also think about UDP use case. Maybe we should do as you suggested
and support release only in UDP. (Or maybe have release for TCP+UDP and an
additional 'disconnect' hook to handle connect(AF_UNSPEC), that might
be more generic).
But thanks for you input, really appreciate it!
>
> >
> > --
> >
> > diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> > index b703ad242365..ee3dc181df8f 100644
> > --- a/net/ipv4/af_inet.c
> > +++ b/net/ipv4/af_inet.c
> > @@ -568,8 +568,11 @@ int inet_dgram_connect(struct socket *sock, struct sockaddr *uaddr,
> >
> > if (addr_len < sizeof(uaddr->sa_family))
> > return -EINVAL;
> > - if (uaddr->sa_family == AF_UNSPEC)
> > + if (uaddr->sa_family == AF_UNSPEC) {
> > + if (BPF_CGROUP_PRE_RELEASE_ENABLED(sk))
> > + sk->sk_prot->pre_release(sk);
> > return sk->sk_prot->disconnect(sk, flags);
> > + }
> >
> > if (BPF_CGROUP_PRE_CONNECT_ENABLED(sk)) {
> > err = sk->sk_prot->pre_connect(sk, uaddr, addr_len);
> > @@ -632,6 +635,8 @@ int __inet_stream_connect(struct socket *sock, struct sockaddr *uaddr,
> > return -EINVAL;
> >
> > if (uaddr->sa_family == AF_UNSPEC) {
> > + if (BPF_CGROUP_PRE_RELEASE_ENABLED(sk))
> > + sk->sk_prot->pre_release(sk);
> > err = sk->sk_prot->disconnect(sk, flags);
> > sock->state = err ? SS_DISCONNECTING : SS_UNCONNECTED;
> > goto out;
> >
> > > > First patch adds hooks, the rest of the patches add uapi and tests to make
> > > > sure these hooks work.
> > > >
> > > > Stanislav Fomichev (5):
> > > > bpf: add BPF_CGROUP_INET{4,6}_SOCK_RELEASE hooks
> > > > tools: bpf: support BPF_CGROUP_INET{4,6}_SOCK_RELEASE in
> > > > libbpf/bpftool
> > > > selftests/bpf: add BPF_CGROUP_INET{4,6}_SOCK_RELEASE to
> > > > test_section_names.c
> > > > selftests/bpf: add BPF_CGROUP_INET{4,6}_SOCK_RELEASE to test_sock.c
> > > > selftests/bpf: add BPF_CGROUP_INET{4,6}_SOCK_RELEASE to
> > > > test_sock_addr.c
> > > >
> > > > include/linux/bpf-cgroup.h | 6 +
> > > > include/net/inet_common.h | 1 +
> > > > include/uapi/linux/bpf.h | 2 +
> > > > kernel/bpf/syscall.c | 8 ++
> > > > net/core/filter.c | 7 +
> > > > net/ipv4/af_inet.c | 13 +-
> > > > net/ipv6/af_inet6.c | 5 +-
> > > > tools/bpf/bpftool/cgroup.c | 2 +
> > > > tools/include/uapi/linux/bpf.h | 2 +
> > > > tools/lib/bpf/libbpf.c | 4 +
> > > > .../selftests/bpf/test_section_names.c | 10 ++
> > > > tools/testing/selftests/bpf/test_sock.c | 119 ++++++++++++++++
> > > > tools/testing/selftests/bpf/test_sock_addr.c | 131 +++++++++++++++++-
> > > > 13 files changed, 307 insertions(+), 3 deletions(-)
> > > >
> > > > --
> > > > 2.20.1.321.g9e740568ce-goog
> > > >
> > >
> > > --
> > > Andrey Ignatov
>
> --
> Andrey Ignatov
Powered by blists - more mailing lists