[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211219190810.p3q52rrlchnokufo@ast-mbp>
Date: Sun, 19 Dec 2021 11:08:10 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Kumar Kartikeya Dwivedi <memxor@...il.com>
Cc: bpf <bpf@...r.kernel.org>,
Network Development <netdev@...r.kernel.org>,
netfilter-devel <netfilter-devel@...r.kernel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
Maxim Mikityanskiy <maximmi@...dia.com>,
Pablo Neira Ayuso <pablo@...filter.org>,
Florian Westphal <fw@...len.de>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Toke Høiland-Jørgensen <toke@...hat.com>
Subject: Re: [PATCH bpf-next v4 06/10] bpf: Track provenance for pointers
formed from referenced PTR_TO_BTF_ID
On Sun, Dec 19, 2021 at 11:40:44PM +0530, Kumar Kartikeya Dwivedi wrote:
> On Sun, Dec 19, 2021 at 11:13:18PM IST, Alexei Starovoitov wrote:
> > On Sat, Dec 18, 2021 at 9:25 PM Kumar Kartikeya Dwivedi
> > <memxor@...il.com> wrote:
> > >
> > > On Sun, Dec 19, 2021 at 10:35:18AM IST, Alexei Starovoitov wrote:
> > > > On Sat, Dec 18, 2021 at 8:33 PM Kumar Kartikeya Dwivedi
> > > > <memxor@...il.com> wrote:
> > > > >
> > > > > It is, but into parent_ref_obj_id, to match during release_reference.
> > > > >
> > > > > > Shouldn't r2 get a different ref_obj_id after r2 = r1->next ?
> > > > >
> > > > > It's ref_obj_id is still 0.
> > > > >
> > > > > Thinking about this more, we actually only need 1 extra bit of information in
> > > > > reg_state, not even a new member. We can simply copy ref_obj_id and set this
> > > > > bit, then we can reject this register during release but consider it during
> > > > > release_reference.
> > > >
> > > > It seems to me that this patch created the problem and it's trying
> > > > to fix it at the same time.
> > > >
> > >
> > > Yes, sort of. Maybe I need to improve the commit message? I give an example
> > > below, and the first half of commit explains that if we simply did copy
> > > ref_obj_id, it would lead to the case in the previous mail (same BTF ID ptr can
> > > be passed), so we need to do something different.
> > >
> > > Maybe that is what is confusing you.
> >
> > I'm still confused.
> > Why does mark_btf_ld_reg() need to copy ref_obj_id ?
> > It should keep it as zero.
>
> So that we can find deref pointers obtained from the reg having that ref_obj_id
> when it is released, and invalidate them. But since directly storing in
> ref_obj_id of deref dst_reg will be bad (if we get same BTF ID from deref we
> could now pass it to release kfunc), we add a new member and match on that.
>
> > mark_btf_ld_reg() is used in deref only.
> > The ref_obj_id is assigned by check_helper_call().
> > r2 = r0; will copy it, but
> > r2 = r0->next; will keep r2->ref_obj_id as zero.
> >
> > > > mark_btf_ld_reg() shouldn't be copying ref_obj_id.
> > > > If it keeps it as zero the problem will not happen, no?
> > >
> > > It is copying it but writing it to parent_ref_obj_id. It keeps ref_obj_id as 0
> > > for all deref pointers.
> > >
> > > r1 = acq(); // r1.ref = acquire_reference_state();
> > > ref = N
> > > r2 = r1->a; // mark_btf_ld_reg -> copy r1.(ref ?: parent_ref) -> so r2.parent_ref = r1.ref
> > > r3 = r2->b; // mark_btf_ld_reg -> copy r2.(ref ?: parent_ref) -> so r3.parent_ref = r2.parent_ref
> > > r4 = r3->c; // mark_btf_ld_reg -> copy r3.(ref ?: parent_ref) -> so r4.parent_ref = r3.parent_ref
> > > rel(r1); // if (reg.ref == r1.ref || reg.parent_ref == r1.ref) invalidate(reg)
> > >
> > > As you see, mark_btf_ld_reg only ever writes to parent_ref_obj_id, not
> > > ref_obj_id. It just copies ref_obj_id when it is set, over parent_ref_obj_id,
> > > and only one of two can be set.
> >
> > I don't understand why such logic is needed.
>
> Ok, let me try to explain once how I arrived at it. If you still don't like it,
> I'll drop it from the series.
>
> So until this patch, when we do the following:
>
> struct nf_conn *ct = bpf_xdp_ct_lookup(...);
> if (ct) {
> struct nf_conn *master = ct->master;
> bpf_ct_release(ct);
> unsigned long status = master->status; // I want to prevent this
> }
>
> ... this will work, which is ok (as in won't crash the kernel) since the load
> will be converted to BPF_PROBE_MEM, but I want to disallow this case since it is
> obviously incorrect.
Finally we're talking! This motivation should have been in the commit log
and this thread wouldn't have been that long.
> The obvious solution (to me) was to kill all registers and stack slots for deref
> pointers.
>
> My first naive solution was to simply copy ref_obj_id on mark_btf_ld_reg, so
> that it can be matched and released from release_reference.
That what I was guessing.
> But then I noticed that if the BTF ID is same, there is no difference when it is
> passed to release kfunc compared to the original register it was loaded from.
>
> struct nf_conn *ct = bpf_xdp_ct_lookup(...);
> if (ct) {
> struct nf_conn *master = ct->master; // copied ref_obj_id
> bpf_ct_release(master); // works, but shouldn't!
> }
>
> So the code needed some way to distinguish this deref pointer that must be
> invalidated only when its 'parent' goes away. Hence the introduction of
> parent_ref_obj_id, and the invariant that only one of ref_obj_id or
> parent_ref_obj_id must be set.
The goal is clear now, but look at it differently:
struct nf_conn *ct = bpf_xdp_ct_lookup(...);
if (ct) {
struct nf_conn *master = ct->master;
struct net *net = ct->ct_net.net;
bpf_ct_release(ct);
master->status; // prevent this ?
net->ifindex; // but allow this ?
}
The verifier cannot statically check this. That's why all such deref
are done via BPF_PROBE_MEM (which is the same as probe_read_kernel).
We must disallow use after free when it can cause a crash.
This case is not the one.
This one, though:
struct nf_conn *ct = bpf_xdp_ct_lookup(...);
struct nf_conn *master = ct->master;
bpf_ct_release(master);
definitely has to be prevented, since it will cause a crash.
As a follow up to this set would be great to allow ptr_to_btf_id
pointers persist longer than program execution.
Users already asked to allow the following:
map_value = bpf_map_lookup_elem(...);
struct nf_conn *ct = bpf_xdp_ct_lookup(...);
map_value->saved_ct = ct;
and some time later in a different or the same program:
map_value = bpf_map_lookup_elem(...);
bpf_ct_release(map_value->saved_ct);
Currently folks work around this deficiency by storing some
sort of id and doing extra lookups while performance is suffering.
wdyt?
Powered by blists - more mailing lists