Message-ID: <20190610161015.GF9660@mini-arch>
Date: Mon, 10 Jun 2019 09:10:15 -0700
From: Stanislav Fomichev <sdf@...ichev.me>
To: Martin Lau <kafai@...com>
Cc: Stanislav Fomichev <sdf@...gle.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"bpf@...r.kernel.org" <bpf@...r.kernel.org>,
"davem@...emloft.net" <davem@...emloft.net>,
"ast@...nel.org" <ast@...nel.org>,
"daniel@...earbox.net" <daniel@...earbox.net>,
Andrii Nakryiko <andriin@...com>
Subject: Re: [PATCH bpf-next v3 1/8] bpf: implement getsockopt and setsockopt hooks

On 06/08, Martin Lau wrote:
> On Fri, Jun 07, 2019 at 09:29:13AM -0700, Stanislav Fomichev wrote:
> > Implement new BPF_PROG_TYPE_CGROUP_SOCKOPT program type and
> > BPF_CGROUP_{G,S}ETSOCKOPT cgroup hooks.
> >
> > BPF_CGROUP_SETSOCKOPT gets a read-only view of the setsockopt arguments.
> > BPF_CGROUP_GETSOCKOPT can modify the supplied buffer.
> > Both of them reuse existing PTR_TO_PACKET{,_END} infrastructure.
> >
> > The buffer memory is pre-allocated (because I don't think there is
> > a precedent for working with __user memory from bpf). This might be
> > slow to do for each {s,g}etsockopt call, that's why I've added
> > __cgroup_bpf_prog_array_is_empty that exits early if there is nothing
> > attached to a cgroup. Note, however, that there is a race between
> > __cgroup_bpf_prog_array_is_empty and BPF_PROG_RUN_ARRAY where cgroup
> > program layout might have changed; this should not be a problem
> > because in general there is a race between multiple calls to
> > {s,g}etsockopt and user adding/removing bpf progs from a cgroup.
> >
> > The return code of the BPF program is handled as follows:
> > * 0: EPERM
> > * 1: success, execute kernel {s,g}etsockopt path after BPF prog exits
> > * 2: success, do _not_ execute kernel {s,g}etsockopt path after BPF
> > prog exits
> >
> > v3:
> > * typos in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY comments (Andrii Nakryiko)
> > * reverse christmas tree in BPF_PROG_CGROUP_SOCKOPT_RUN_ARRAY (Andrii
> > Nakryiko)
> > * use __bpf_md_ptr instead of __u32 for optval{,_end} (Martin Lau)
> > * use BPF_FIELD_SIZEOF() for consistency (Martin Lau)
> > * new CG_SOCKOPT_ACCESS macro to wrap repeated parts
> >
> > v2:
> > * moved bpf_sockopt_kern fields around to remove a hole (Martin Lau)
> > * aligned bpf_sockopt_kern->buf to 8 bytes (Martin Lau)
> > * bpf_prog_array_is_empty instead of bpf_prog_array_length (Martin Lau)
> > * added [0,2] return code check to verifier (Martin Lau)
> > * dropped unused buf[64] from the stack (Martin Lau)
> > * use PTR_TO_SOCKET for bpf_sockopt->sk (Martin Lau)
> > * dropped bpf_target_off from ctx rewrites (Martin Lau)
> > * use return code for kernel bypass (Martin Lau & Andrii Nakryiko)
> >
>
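For reference, the return code handling described above can be sketched
in plain userspace C. This is only an illustration under assumptions:
the enum names, handle_sockopt() and kernel_sockopt_path() are
hypothetical and do not appear in the patch, which uses the bare values
0, 1 and 2. The -EINVAL default branch is also an assumption; in the
patch the verifier is expected to restrict return values to [0,2].

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names for the three return codes; the patch uses 0/1/2. */
enum sockopt_verdict {
	SOCKOPT_DENY = 0,		/* fail the syscall with EPERM */
	SOCKOPT_ALLOW_KERNEL = 1,	/* run the kernel path afterwards */
	SOCKOPT_ALLOW_BYPASS = 2,	/* succeed, skip the kernel path */
};

/* Stand-in for the regular kernel {s,g}etsockopt implementation. */
static int kernel_sockopt_path(void)
{
	return 0;
}

/*
 * Map the BPF program's return code to the syscall result.
 * *kernel_called records whether the kernel path was executed.
 */
static int handle_sockopt(int bpf_ret, int *kernel_called)
{
	*kernel_called = 0;

	switch (bpf_ret) {
	case SOCKOPT_DENY:
		return -EPERM;
	case SOCKOPT_ALLOW_KERNEL:
		*kernel_called = 1;
		return kernel_sockopt_path();
	case SOCKOPT_ALLOW_BYPASS:
		return 0;
	default:
		/* Assumption: never reached once the verifier enforces [0,2]. */
		return -EINVAL;
	}
}
```

With this sketch, a program returning 2 yields success without the
kernel path ever being entered, while 0 short-circuits with -EPERM.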
> > diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
> > index 1b65ab0df457..4fc8429af6fc 100644
> > --- a/kernel/bpf/cgroup.c
> > +++ b/kernel/bpf/cgroup.c
>
> [ ... ]
>
> > +static const struct bpf_func_proto *
> > +cg_sockopt_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> > +{
> > +	switch (func_id) {
> > +	case BPF_FUNC_sk_fullsock:
> > +		return &bpf_sk_fullsock_proto;
> Maybe my v2 comment was missed.
>
> sk here (i.e. PTR_TO_SOCKET) must be a fullsock.
> bpf_sk_fullsock() will be a no-op. Hence, there is
> no need to expose bpf_sk_fullsock_proto.
I think I missed the fact that PTR_TO_SOCKET implies fullsock.
Will remove, thanks!
> > +	case BPF_FUNC_sk_storage_get:
> > +		return &bpf_sk_storage_get_proto;
> > +	case BPF_FUNC_sk_storage_delete:
> > +		return &bpf_sk_storage_delete_proto;
> > +#ifdef CONFIG_INET
> > +	case BPF_FUNC_tcp_sock:
> > +		return &bpf_tcp_sock_proto;
> > +#endif
> > +	default:
> > +		return cgroup_base_func_proto(func_id, prog);
> > +	}
> > +}
> > +
> > +static bool cg_sockopt_is_valid_access(int off, int size,
> > +				       enum bpf_access_type type,
> > +				       const struct bpf_prog *prog,
> > +				       struct bpf_insn_access_aux *info)
> > +{
> > +	const int size_default = sizeof(__u32);
> > +
> > +	if (off < 0 || off >= sizeof(struct bpf_sockopt))
> > +		return false;
> > +
> > +	if (off % size != 0)
> > +		return false;
> > +
> > +	if (type == BPF_WRITE) {
> > +		switch (off) {
> > +		case offsetof(struct bpf_sockopt, optlen):
> > +			if (size != size_default)
> > +				return false;
> > +			return prog->expected_attach_type ==
> > +				BPF_CGROUP_GETSOCKOPT;
> > +		default:
> > +			return false;
> > +		}
> > +	}
> > +
> > +	switch (off) {
> > +	case offsetof(struct bpf_sockopt, sk):
> > +		if (size != sizeof(struct bpf_sock *))
> Based on my understanding in commit b7df9ada9a77 ("bpf: fix pointer offsets in context for 32 bit"),
> I think it should be 'size != sizeof(__u64)'
>
> Same for the optval and optval_end below.
Good point. When converting BPF_W to BPF_DW in the tests, I was actually
wondering whether that would work correctly on 32 bits. Thanks for the
commit pointer; these should, indeed, all always be sizeof(__u64).
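To illustrate why sizeof(__u64) is the right comparison, here is a
small userspace sketch. It is not the patch code: md_ptr() is a
simplified stand-in for the uapi __bpf_md_ptr() macro, and
struct sockopt_ctx_sketch and ptr_access_ok() are hypothetical names
with an assumed field layout. The point is that each pointer field
occupies 8 bytes in the context even where sizeof(void *) is 4, and
that matching on the exact offsetof() leaves no room for narrow loads.
It relies on GCC/Clang extensions (anonymous unions, 64-bit bit-fields).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for __bpf_md_ptr(): pad every pointer to 8 bytes. */
#define md_ptr(type, name)		\
	union {				\
		type name;		\
		uint64_t :64;		\
	} __attribute__((aligned(8)))

/* Hypothetical context layout, loosely modeled on struct bpf_sockopt. */
struct sockopt_ctx_sketch {
	md_ptr(void *, sk);
	md_ptr(void *, optval);
	md_ptr(void *, optval_end);
	int32_t level;
	int32_t optname;
	uint32_t optlen;
};

/*
 * Accept a load of a pointer field only if it starts exactly at the
 * field and is a full 8 bytes wide, independent of sizeof(void *).
 */
static int ptr_access_ok(size_t off, size_t size)
{
	switch (off) {
	case offsetof(struct sockopt_ctx_sketch, sk):
	case offsetof(struct sockopt_ctx_sketch, optval):
	case offsetof(struct sockopt_ctx_sketch, optval_end):
		return size == sizeof(uint64_t);
	default:
		return 0;
	}
}
```

On a 32-bit build, comparing against sizeof(void *) would accept only
4-byte loads here, which does not match the 8-byte slots that the
context layout reserves for each pointer.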
> > +			return false;
> > +		info->reg_type = PTR_TO_SOCKET;
> > +		break;
> > +	case bpf_ctx_range(struct bpf_sockopt, optval):
> offsetof(struct bpf_sockopt, optval)
Ack. No narrow loads for the pointers.
> > +		if (size != sizeof(void *))
> > +			return false;
> > +		info->reg_type = PTR_TO_PACKET;
> > +		break;
> > +	case bpf_ctx_range(struct bpf_sockopt, optval_end):
> offsetof(struct bpf_sockopt, optval_end)
>
> > +		if (size != sizeof(void *))
> > +			return false;
> > +		info->reg_type = PTR_TO_PACKET_END;
> > +		break;
> > +	default:
> > +		if (size != size_default)
> > +			return false;
> > +		break;
> > +	}
> > +	return true;
> > +}
> > +
>
> [ ... ]
>
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index 55bfc941d17a..4652c0a005ca 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -1835,7 +1835,7 @@ BPF_CALL_1(bpf_sk_fullsock, struct sock *, sk)
> >  	return sk_fullsock(sk) ? (unsigned long)sk : (unsigned long)NULL;
> >  }
> >
> > -static const struct bpf_func_proto bpf_sk_fullsock_proto = {
> > +const struct bpf_func_proto bpf_sk_fullsock_proto = {
> As mentioned above, this change is also not needed.
>
> Others LGTM.
Agreed, will not export.
> >  	.func		= bpf_sk_fullsock,
> >  	.gpl_only	= false,
> >  	.ret_type	= RET_PTR_TO_SOCKET_OR_NULL,
> > @@ -5636,7 +5636,7 @@ BPF_CALL_1(bpf_tcp_sock, struct sock *, sk)
> >  	return (unsigned long)NULL;
> >  }
> >