[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230307102233.bemr47x625ity26z@apollo>
Date: Tue, 7 Mar 2023 11:22:33 +0100
From: Kumar Kartikeya Dwivedi <memxor@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Joanne Koong <joannelkoong@...il.com>, bpf <bpf@...r.kernel.org>,
Martin KaFai Lau <martin.lau@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Network Development <netdev@...r.kernel.org>,
Toke Høiland-Jørgensen <toke@...nel.org>
Subject: Re: [PATCH v13 bpf-next 09/10] bpf: Add bpf_dynptr_slice and
bpf_dynptr_slice_rdwr
On Tue, Mar 07, 2023 at 03:23:25AM CET, Alexei Starovoitov wrote:
> On Sun, Mar 5, 2023 at 11:10 PM Kumar Kartikeya Dwivedi
> <memxor@...il.com> wrote:
> >
> > On Wed, Mar 01, 2023 at 04:49:52PM CET, Joanne Koong wrote:
> > > Two new kfuncs are added, bpf_dynptr_slice and bpf_dynptr_slice_rdwr.
> > > The user must pass in a buffer to store the contents of the data slice
> > > if a direct pointer to the data cannot be obtained.
> > >
> > > For skb and xdp type dynptrs, these two APIs are the only way to obtain
> > > a data slice. However, for other types of dynptrs, there is no
> > > difference between bpf_dynptr_slice(_rdwr) and bpf_dynptr_data.
> > >
> > > For skb type dynptrs, the data is copied into the user provided buffer
> > > if any of the data is not in the linear portion of the skb. For xdp type
> > > dynptrs, the data is copied into the user provided buffer if the data is
> > > between xdp frags.
> > >
> > > If the skb is cloned and a call to bpf_dynptr_data_rdwr is made, then
> > > the skb will be uncloned (see bpf_unclone_prologue()).
> > >
> > > Please note that any bpf_dynptr_write() automatically invalidates any prior
> > > data slices of the skb dynptr. This is because the skb may be cloned or
> > > may need to pull its paged buffer into the head. As such, any
> > > bpf_dynptr_write() will automatically have its prior data slices
> > > invalidated, even if the write is to data in the skb head of an uncloned
> > > skb. Please note as well that any other helper calls that change the
> > > underlying packet buffer (eg bpf_skb_pull_data()) invalidates any data
> > > slices of the skb dynptr as well, for the same reasons.
> > >
> > > Signed-off-by: Joanne Koong <joannelkoong@...il.com>
> > > ---
> >
> > Sorry for chiming in late.
> >
> > I see one potential hole in bpf_dynptr_slice_rdwr. If the returned pointer is
> > actually pointing to the stack (but verified as a PTR_TO_MEM in verifier state),
> > we won't reflect changes to the stack state in the verifier for writes happening
> > through it.
> >
> > For the worst case scenario, this will basically allow overwriting values of
> > spilled pointers and doing arbitrary kernel memory reads/writes. This is only an
> > issue when bpf_dynptr_slice_rdwr at runtime returns a pointer to the supplied
> > buffer residing on program stack. To verify, by forcing the memcpy to buffer for
> > skb_header_pointer I was able to make it dereference a garbage value for
> > l4lb_all selftest.
> >
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -2253,7 +2253,13 @@ __bpf_kfunc void *bpf_dynptr_slice(const struct bpf_dynptr_kern *ptr, u32 offset
> > case BPF_DYNPTR_TYPE_RINGBUF:
> > return ptr->data + ptr->offset + offset;
> > case BPF_DYNPTR_TYPE_SKB:
> > - return skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer);
> > + {
> > + void *p = skb_header_pointer(ptr->data, ptr->offset + offset, len, buffer);
> > + if (p == buffer)
> > + return p;
> > + memcpy(buffer, p, len);
> > + return buffer;
> > + }
> >
> > --- a/tools/testing/selftests/bpf/progs/test_l4lb_noinline_dynptr.c
> > +++ b/tools/testing/selftests/bpf/progs/test_l4lb_noinline_dynptr.c
> > @@ -470,7 +470,10 @@ int balancer_ingress(struct __sk_buff *ctx)
> > eth = bpf_dynptr_slice_rdwr(&ptr, 0, buffer, sizeof(buffer));
> > if (!eth)
> > return TC_ACT_SHOT;
> > - eth_proto = eth->eth_proto;
> > + *(void **)buffer = ctx;
>
> Great catch.
> To fix the issue I think we should simply disallow such
> stack abuse. The compiler won't be spilling registers
> into C array on the stack.
> This manual spill/fill is exploiting verifier logic.
> After bpf_dynptr_slice_rdwr() we can mark all slots of the
> buffer as STACK_POISON or some better name and
> reject spill into such slots.
>
I agree this is simpler, but I'm not sure it will work properly. Verifier won't
know when the lifetime of the buffer ends, so if we disallow spills until its
written over it's going to be a pain for users.
Something like:
for (...) {
char buf[64];
bpf_dynptr_slice_rdwr(..., buf, 64);
...
}
.. and then compiler decides to spill something where buf was located on stack
outside the for loop. The verifier can't know when buf goes out of scope to
unpoison the slots.
> > + *(void **)eth = (void *)0xdeadbeef;
> > + ctx = *(void **)buffer;
> > + eth_proto = eth->eth_proto + ctx->len;
> > if (eth_proto == bpf_htons(ETH_P_IP))
> > err = process_packet(&ptr, eth, nh_off, false, ctx);
> >
> > I think the proper fix is to treat it as a separate return type distinct from
> > PTR_TO_MEM like PTR_TO_MEM_OR_PKT (or handle PTR_TO_MEM | DYNPTR_* specially),
> > fork verifier state whenever there is a write, so that one path verifies it as
> > PTR_TO_PACKET, while another as PTR_TO_STACK (if buffer was a stack ptr). I
> > think for the rest it's not a problem, but there are allow_ptr_leak checks
> > applied to PTR_TO_STACK and PTR_TO_MAP_VALUE, so that needs to be rechecked.
> > Then we ensure that program is safe in either path.
> >
> > Also we need to fix regsafe to not consider other PTR_TO_MEMs equivalent to such
> > a pointer. We could also fork verifier states on return, to verify either path
> > separately right from the point following the call instruction.
>
> This is too complex imo.
A better way to phrase this is to verify with R0 = PTR_TO_PACKET in one path,
and push_stack with R0 = buffer's reg->type + size set to len in the other path
for exploration later. In terms of verifier infra everything is there already,
it just needs to analyze both cases which fall into the regular code handling
the reg->type's. Probably then no adjustments to regsafe are needed either. It's
like exploring branch instructions.
Powered by blists - more mailing lists