Message-ID: <YAswxL1dZhdbAseP@rdna-mbp.dhcp.thefacebook.com>
Date: Fri, 22 Jan 2021 12:08:36 -0800
From: Andrey Ignatov <rdna@...com>
To: Stanislav Fomichev <sdf@...gle.com>
CC: Martin KaFai Lau <kafai@...com>, Netdev <netdev@...r.kernel.org>,
bpf <bpf@...r.kernel.org>, Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>
Subject: Re: [PATCH bpf-next 1/2] bpf: allow rewriting to ports under ip_unprivileged_port_start
Stanislav Fomichev <sdf@...gle.com> [Fri, 2021-01-22 11:54 -0800]:
> On Fri, Jan 22, 2021 at 11:37 AM Andrey Ignatov <rdna@...com> wrote:
> >
> > Stanislav Fomichev <sdf@...gle.com> [Wed, 2021-01-20 18:09 -0800]:
> > > At the moment, BPF_CGROUP_INET{4,6}_BIND hooks can rewrite user_port
> > > to the privileged ones (< ip_unprivileged_port_start), but it will
> > > be rejected later on in the __inet_bind or __inet6_bind.
> > >
> > > Let's export 'port_changed' event from the BPF program and bypass
> > > ip_unprivileged_port_start range check when we've seen that
> > > the program explicitly overrode the port. This is accomplished
> > > by generating instructions to set ctx->port_changed along with
> > > updating ctx->user_port.
> > >
> > > Signed-off-by: Stanislav Fomichev <sdf@...gle.com>
> > > ---
> > ...
> > > @@ -244,17 +245,27 @@ int bpf_percpu_cgroup_storage_update(struct bpf_map *map, void *key,
> > > if (cgroup_bpf_enabled(type)) { \
> > > lock_sock(sk); \
> > > __ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, type, \
> > > - t_ctx); \
> > > + t_ctx, NULL); \
> > > release_sock(sk); \
> > > } \
> > > __ret; \
> > > })
> > >
> > > -#define BPF_CGROUP_RUN_PROG_INET4_BIND_LOCK(sk, uaddr) \
> > > - BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_INET4_BIND, NULL)
> > > -
> > > -#define BPF_CGROUP_RUN_PROG_INET6_BIND_LOCK(sk, uaddr) \
> > > - BPF_CGROUP_RUN_SA_PROG_LOCK(sk, uaddr, BPF_CGROUP_INET6_BIND, NULL)
> > > +#define BPF_CGROUP_RUN_PROG_INET_BIND_LOCK(sk, uaddr, type, flags) \
> > > +({ \
> > > + bool port_changed = false; \
> >
> > I see the discussion with Martin in [0] on the program overriding the
> > port but setting exactly same value as it already contains. Commenting
> > on this patch since the code is here.
> >
> > From what I understand there is no use-case to support overriding the
> > port w/o changing the value to just bypass the capability. In this case
> > the code can be simplified.
> >
> > Here instead of introducing port_changed you can just remember the
> > original ((struct sockaddr_in *)uaddr)->sin_port or
> > ((struct sockaddr_in6 *)uaddr)->sin6_port (they have same offset/size so
> > it can be simplified same way as in sock_addr_convert_ctx_access() for
> > user_port) ...
> >
> > > + int __ret = 0; \
> > > + if (cgroup_bpf_enabled(type)) { \
> > > + lock_sock(sk); \
> > > + __ret = __cgroup_bpf_run_filter_sock_addr(sk, uaddr, type, \
> > > + NULL, \
> > > + &port_changed); \
> > > + release_sock(sk); \
> > > + if (port_changed) \
> >
> > ... and then just compare the original and the new ports here.
> >
> > The benefits will be:
> > * no need to introduce port_changed field in struct bpf_sock_addr_kern;
> > * no need to change program instructions;
> > * no need to think about compiler optimizing out those instructions;
> > * no need to think about coordinating multiple programs: the flag is
> > set only if the port has actually changed, which is easy to reason
> > about from the user's perspective.
> >
> > wdyt?
> Martin mentioned in another email that we might want to do that when
> we rewrite only the address portion of it.
> I think it makes sense. Imagine doing 1.1.1.1:50 -> 2.2.2.2:50 it
> seems like it should also work, right?
> And in this case, we need to store and compare addresses as well and
> it becomes messy :-/
Why does the address matter? CAP_NET_BIND_SERVICE is only about ports,
not addresses.
IMO an address change should not matter for bypassing
CAP_NET_BIND_SERVICE in this case, so there should be no need to compare
addresses; comparing the port alone should be enough.
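To make the port-only comparison concrete, here is a rough user-space
sketch, not kernel code. It remembers the port before the hook runs and
compares it afterwards. All names in it (bind_port_allowed, bind_hook_fn,
the two example hooks, UNPRIV_PORT_START) are made up for illustration,
and the capability check is reduced to a plain boolean.

```c
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Assumption: stand-in for the ip_unprivileged_port_start sysctl. */
#define UNPRIV_PORT_START 1024

/* Hypothetical stand-in for a BPF bind hook that may rewrite uaddr. */
typedef void (*bind_hook_fn)(struct sockaddr *uaddr);

/* sin_port and sin6_port have the same offset and size, so reading the
 * port through struct sockaddr_in is enough for this sketch. */
static uint16_t get_port_be(const struct sockaddr *uaddr)
{
	return ((const struct sockaddr_in *)uaddr)->sin_port;
}

/* Example hook: rewrites the port to 80, leaves the address alone. */
static void hook_rewrite_port_80(struct sockaddr *uaddr)
{
	((struct sockaddr_in *)uaddr)->sin_port = htons(80);
}

/* Example hook: rewrites only the address (to 2.2.2.2). */
static void hook_rewrite_addr_only(struct sockaddr *uaddr)
{
	((struct sockaddr_in *)uaddr)->sin_addr.s_addr = htonl(0x02020202);
}

/* Returns true if the bind may proceed as far as the privileged-port
 * check is concerned. has_cap models CAP_NET_BIND_SERVICE. */
static bool bind_port_allowed(struct sockaddr *uaddr, bind_hook_fn hook,
			      bool has_cap)
{
	uint16_t old_port = get_port_be(uaddr);	/* remember original */
	uint16_t new_port;

	if (hook)
		hook(uaddr);
	new_port = get_port_be(uaddr);

	/* Port 0 (pick any port) and unprivileged ports need no check. */
	if (new_port == 0 || ntohs(new_port) >= UNPRIV_PORT_START)
		return true;

	/* Privileged port: allow with the capability, or if the hook
	 * actually changed the port value. Addresses are not compared. */
	return has_cap || new_port != old_port;
}
```

With this shape, a hook that rewrites 8080 to 80 passes without the
capability, while a hook that only changes the address (or writes back
the same port) still goes through the normal capability check.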
> It also seems like it would be nice to have this "bypass
> cap_net_bind_service" without changing the address while we are at it.
Yeah, this part determines the behaviour. I guess it should be use-case
driven. So far it seems to be more of a "nice to have" than a real use
case, but I may be missing something; please correct me if so.
--
Andrey Ignatov