Message-ID: <67a28d535a91396a20e7fb5ff4c322395c947eb8.camel@linux.ibm.com>
Date: Fri, 10 Mar 2023 04:40:11 +0100
From: Ilya Leoshkevich <iii@...ux.ibm.com>
To: Joanne Koong <joannelkoong@...il.com>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
bpf <bpf@...r.kernel.org>,
Martin KaFai Lau <martin.lau@...nel.org>,
Andrii Nakryiko <andrii@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Kumar Kartikeya Dwivedi <memxor@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
Network Development <netdev@...r.kernel.org>,
Toke Høiland-Jørgensen <toke@...nel.org>,
Stanislav Fomichev <sdf@...gle.com>
Subject: Re: [PATCH v13 bpf-next 10/10] selftests/bpf: tests for using
dynptrs to parse skb and xdp buffers
On Thu, 2023-03-09 at 00:13 -0800, Joanne Koong wrote:
> On Wed, Mar 8, 2023 at 6:24 AM Ilya Leoshkevich <iii@...ux.ibm.com>
> wrote:
> >
> > On Tue, 2023-03-07 at 23:22 -0800, Joanne Koong wrote:
> > > On Tue, Mar 7, 2023 at 5:55 PM Ilya Leoshkevich
> > > <iii@...ux.ibm.com>
> > > wrote:
> > > >
> > > > On Wed, Mar 01, 2023 at 08:28:40PM -0800, Joanne Koong wrote:
> > > > > On Wed, Mar 1, 2023 at 10:08 AM Alexei Starovoitov
> > > > > <alexei.starovoitov@...il.com> wrote:
> > > > > >
> > > > > > On Wed, Mar 1, 2023 at 7:51 AM Joanne Koong
> > > > > > <joannelkoong@...il.com> wrote:
> > > > > > >
> > > > > > > 5) progs/dynptr_success.c
> > > > > > > * Add test case "test_skb_readonly" for testing
> > > > > > > attempts
> > > > > > > at writes
> > > > > > > on a prog type with read-only skb ctx.
> > > > > > > * Add "test_dynptr_skb_data" for testing that
> > > > > > > bpf_dynptr_data isn't
> > > > > > > supported for skb progs.
> > > > > >
> > > > > > I added
> > > > > > +dynptr/test_dynptr_skb_data
> > > > > > +dynptr/test_skb_readonly
> > > > > > to DENYLIST.s390x and applied.
> > > > >
> > > > > Thanks, I'm still not sure why s390x cannot load these
> > > > > programs.
> > > > > It is
> > > > > being loaded in the same way as other tests like
> > > > > test_parse_tcp_hdr_opt() are loading programs. I will keep
> > > > > looking
> > > > > some more into this
> > > >
> > > > Hi,
> > > >
> > > > I believe the culprit is:
> > > >
> > > > insn->imm = BPF_CALL_IMM(bpf_dynptr_from_skb_rdonly);
> > > >
> > > > s390x needs to know the kfunc model in order to emit the call
> > > > (like
> > > > i386), but after this assignment it's no longer possible to
> > > > look it
> > > > up in kfunc_tab by insn->imm. x86_64 does not need this,
> > > > because
> > > > its
> > > > ABI is exactly the same as BPF ABI.
> > > >
> > > > The simplest solution seems to be adding an artificial
> > > > kfunc_desc
> > > > like this:
> > > >
> > > > {
> > > > .func_model = desc->func_model, /* model must be
> > > > compatible */
> > > > .func_id = 0, /* unused at this
> > > > point */
> > > > .imm = insn->imm, /* new target */
> > > > .offset = 0, /* unused at this
> > > > point */
> > > > }
> > > >
> > > > here and also after this assignment:
> > > >
> > > > insn->imm = BPF_CALL_IMM(xdp_kfunc);
> > > >
> > > > What do you think?
> > >
> > > Ohh interesting! This makes sense to me. In particular, you're
> > > referring to the bpf_jit_find_kfunc_model() call in
> > > bpf_jit_insn()
> > > (in
> > > arch/s390/net/bpf_jit_comp.c) as the one that fails out whenever
> > > insn->imm gets set, correct?
> >
> > Precisely.
> >
> > > I like your proposed solution, I agree that this looks like the
> > > simplest, though maybe we should replace the existing kfunc_desc
> > > instead of adding it so we don't have to deal with the edge case
> > > of
> > > reaching MAX_KFUNC_DESCS? To get the func model of the new
> > > insn->imm,
> >
> > I wonder whether replacement is safe? This would depend on the
> > following functions returning the same value for the same inputs:
> >
> > - may_access_direct_pkt_data() - this looks ok;
> > - bpf_dev_bound_resolve_kfunc() - I'm not so sure, any insights?
>
> For the bpf_dev_bound_resolve_kfunc() case (in fixup_kfunc_call()), I
> think directly replacing the kfunc_desc here is okay because
> bpf_dev_bound_resolve_kfunc() is finding the target device-specific
> version of the kfunc (if it exists) to replace the generic version of
> the kfunc with, and we're using that target device-specific version
> of
> the kfunc as the new updated insn->imm to call

I'm worried that its return value is going to change while we are
doing the rewriting. It looks as if
__bpf_offload_dev_netdev_unregister() can cause this. So if we have
two instructions that use the same generic kfunc, they may end up
pointing to two different device-specific kfuncs, and the kfunc_tab
will contain only one of the two.

This sounds dangerous, but maybe I don't see some safeguard that
already prevents or mitigates the effects of this?

Stanislav, could you as the bpf_dev_bound_resolve_kfunc() author
give your opinion please? I've seen your comment:

+       /* We don't hold bpf_devs_lock while resolving several
+        * kfuncs and can race with the unregister_netdevice().
+        * We rely on bpf_dev_bound_match() check at attach
+        * to render this program unusable.
+        */

and I'm wondering whether you meant bpf_prog_dev_bound_match(), and
whether it protects against the ABA problem, i.e., if
__bpf_offload_dev_netdev_unregister() is called twice, and we get
aux->offload and aux->offload->netdev at the same addresses?

> > If it's not, then MAX_KFUNC_DESCS indeed becomes a concern.
> >
> > > it seems pretty straightforward, it looks like we can just use
> > > btf_distill_func_proto(), or call add_kfunc_call() directly,
> > > which
> > > would do everything needed, but adds an additional unnecessary
> > > sort
> > > and more overhead for replacing (eg we'd need to first swap the
> > > old
> > > kfunc_desc with the last tab->descs[tab->nr_descs] entry and then
> > > delete the old kfunc_desc before adding the new one). What are
> > > your
> > > thoughts?
> >
> > Is there a way to find BTF by function pointer?
> > IIUC bpf_dev_bound_resolve_kfunc() can return many different
> > things,
> > and btf_distill_func_proto() and add_kfunc_call() need BTF.
> > A straightforward way that immediately comes to mind is to do
> > kallsyms
> > lookup and then resolve by name, but this sounds clumsy.
> >
>
> I'm not sure whether there's a way to find the function's BTF by its
> pointer, but I think maybe we can use the vmlinux btf (which we can
> get through the bpf_get_btf_vmlinux() api) to get the func proto?
The device-specific function may come from a kernel module (e.g.,
veth). But on second thought we don't need this at all; we should
really just take func_model of the generic function, that we already
have. If it is not the same as the model of the device-specific
function, it must be a bug.
> > I've been looking into this in context of fixing (kfunc
> > __bpf_call_base) not fitting into 32 bits on s390x. A solution that
>
> Sorry, I'm not fully understanding - can you elaborate a little on
> what the issue is? why doesn't the __bpf_call_base address fit on
> s390x? my understanding is that s390x is a 64-bit architecture?
On s390x modules and kernel are far away from each other, so
BPF_CALL_IMM() may return ~40 significant bits. This makes the
insn->imm rewriting trick unusable, because insn->imm is just 32 bits
and cannot be extended. There is even a safeguard against this in
add_kfunc_call() ("address of kernel function %s is out of range"
check).
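
To make the range problem concrete, here is a minimal user-space sketch.
It assumes the immediate is the callee's distance from a base address
(as BPF_CALL_IMM() does relative to __bpf_call_base, as far as I
understand it) stored in a signed 32-bit field. The helper names
call_imm_fits() and call_imm() are hypothetical, and the addresses used
with them are made up for illustration, not real s390x values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: does the distance between a callee and the base
 * address fit into a signed 32-bit immediate such as insn->imm?
 */
static bool call_imm_fits(uint64_t func_addr, uint64_t base_addr)
{
	int64_t delta = (int64_t)(func_addr - base_addr);

	return delta >= INT32_MIN && delta <= INT32_MAX;
}

/*
 * Hypothetical helper: encode the distance as a 32-bit immediate.
 * Callers must check call_imm_fits() first; an out-of-range distance
 * cannot be represented and would silently lose its upper bits.
 */
static int32_t call_imm(uint64_t func_addr, uint64_t base_addr)
{
	assert(call_imm_fits(func_addr, base_addr));
	return (int32_t)(int64_t)(func_addr - base_addr);
}
```

A callee a few KiB away from the base encodes fine, while a distance
with ~40 significant bits is rejected by call_imm_fits(), which is the
situation the add_kfunc_call() range check guards against.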

I had a patch that kept BTF ID in insn->imm, but it was decided that
since it required adjusting several JITs, we should not be doing it.

When the s390x JIT sees a kfunc call, it needs to find the respective
kfunc's address and model. Normally this is done using kfunc_tab
lookup. kfunc_tab is indexed by insn->imm values, which we cannot use
for reasons outlined above. Hence the idea below: create another
(unfortunately much less memory-efficient) kfunc_tab indexed by insn
numbers.

Conveniently, this would also solve the problem that we are seeing
here.

> > would solve both problems that I'm currently thinking about is to
> > associate
> >
> > struct {
> >         struct btf_func_model *m;
> >         unsigned long addr;
> > } kfunc_callee;
> >
> > with every insn - during verification it could live in
> > bpf_insn_aux_data, during jiting in bpf_prog, and afterwards it can
> > be freed. Any thoughts about this?
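
The per-insn association quoted above could look roughly like the
following user-space sketch. It is only an illustration under stated
assumptions: struct func_model_sketch is a stand-in for the kernel's
struct btf_func_model, and kfunc_callee_tab together with the
callee_tab_*() helpers are hypothetical names, not existing kernel
APIs. The point is that the lookup key is the instruction index, so
neither a rewritten insn->imm nor a callee that is too far away for a
32-bit immediate affects it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for struct btf_func_model; the fields are illustrative. */
struct func_model_sketch {
	unsigned char ret_size;
	unsigned char nr_args;
};

/* Per-instruction callee record, mirroring the quoted struct. */
struct kfunc_callee {
	const struct func_model_sketch *m;
	unsigned long addr;
};

/* Hypothetical table: one slot per insn; non-call slots stay zeroed. */
struct kfunc_callee_tab {
	struct kfunc_callee *slots;
	size_t nr_insns;
};

static int callee_tab_init(struct kfunc_callee_tab *tab, size_t nr_insns)
{
	assert(tab != NULL);
	tab->slots = calloc(nr_insns, sizeof(*tab->slots));
	if (!tab->slots && nr_insns)
		return -1;
	tab->nr_insns = nr_insns;
	return 0;
}

/* Record the final call target for the instruction at insn_idx. */
static int callee_tab_set(struct kfunc_callee_tab *tab, size_t insn_idx,
			  const struct func_model_sketch *m,
			  unsigned long addr)
{
	if (insn_idx >= tab->nr_insns || !m)
		return -1;
	tab->slots[insn_idx].m = m;
	tab->slots[insn_idx].addr = addr;
	return 0;
}

/* What a JIT would call: returns NULL if the insn has no callee set. */
static const struct kfunc_callee *
callee_tab_lookup(const struct kfunc_callee_tab *tab, size_t insn_idx)
{
	if (insn_idx >= tab->nr_insns || !tab->slots[insn_idx].m)
		return NULL;
	return &tab->slots[insn_idx];
}

/* The table is only needed until jiting is done. */
static void callee_tab_free(struct kfunc_callee_tab *tab)
{
	free(tab->slots);
	tab->slots = NULL;
	tab->nr_insns = 0;
}
```

With this layout, two instructions that started from the same generic
kfunc can each record their own address while sharing the generic
function's model. The cost is one slot per instruction, which is the
memory overhead mentioned earlier.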