[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211231165629.fnaolmwjoqt6hhcb@ast-mbp.dhcp.thefacebook.com>
Date: Fri, 31 Dec 2021 08:56:29 -0800
From: Alexei Starovoitov <alexei.starovoitov@...il.com>
To: Kumar Kartikeya Dwivedi <memxor@...il.com>
Cc: bpf@...r.kernel.org, netdev@...r.kernel.org,
netfilter-devel@...r.kernel.org,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
John Fastabend <john.fastabend@...il.com>,
Maxim Mikityanskiy <maximmi@...dia.com>,
Pablo Neira Ayuso <pablo@...filter.org>,
Florian Westphal <fw@...len.de>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Toke Høiland-Jørgensen <toke@...hat.com>
Subject: Re: [PATCH bpf-next v5 2/9] bpf: Prepare kfunc BTF ID sets when
parsing kernel BTF
On Fri, Dec 31, 2021 at 09:15:07AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Fri, Dec 31, 2021 at 07:53:07AM IST, Alexei Starovoitov wrote:
> > On Thu, Dec 30, 2021 at 08:06:58AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > [...]
> > > +
> > > + /* Identify type */
> > > + symbol_name += pfx_size;
> > > + if (!*symbol_name) {
> > > + bpf_log(bdata->log, "incomplete kfunc btf_id_set specification: %s\n", orig_name);
> > > + return -EINVAL;
> > > + }
> > > + for (i = 0; i < ARRAY_SIZE(kfunc_type_str); i++) {
> > > + pfx_size = strlen(kfunc_type_str[i]);
> > > + if (strncmp(symbol_name, kfunc_type_str[i], pfx_size))
> > > + continue;
> > > + break;
> > > + }
> > > + if (i == ARRAY_SIZE(kfunc_type_str)) {
> > > + bpf_log(bdata->log, "invalid type '%s' for kfunc_btf_id_set %s\n", symbol_name,
> > > + orig_name);
> > > + return -EINVAL;
> > > + }
> > > + type = i;
> > > +
> > > + return btf_populate_kfunc_sets(bdata->btf, bdata->log, hook, type, set);
> >
> > I really like the direction taken by patches 2 and 3.
> > I think we can save the module_kallsyms_on_each_symbol loop though.
> > The registration mechanism, like:
> > register_kfunc_btf_id_set(&prog_test_kfunc_list, &bpf_testmod_kfunc_btf_set);
> > doesn't have to be complete removed.
> > It can replaced with a sequence of calls:
> > btf_populate_kfunc_sets(btf, hook, type, set);
> > from __init of the module.
> > The module knows its 'hook' and 'type' and set==&bpf_testmod_kfunc_btf_set.
> > The bpf_testmod_init() can call btf_populate_kfunc_sets() multiple
> > times to popualte sets into different hooks and types.
> > There is no need to 'unregister' any more.
> > And the patch 1 will no longer be necessary, since we don't need to iterate
> > every symbol of the module with module_kallsyms_on_each_symbol().
> > No need to standardize on the prefix and kfunc_[hook|type]_str,
> > though it's probably good idea anyway across module BTF sets.
> > The main disadvantage is that we lose 'log' in btf_populate_kfunc_sets(),
> > since __init of the module cannot have verifier log at that point.
> > But it can stay as 'ret = -E2BIG;' without bpf_log() and module registration
> > will fail in such case or we just warn inside __init if btf_populate_kfunc_sets
> > fails in the rare case.
> > wdyt?
> >
>
> Sounds good, I'll make this change in the next version. Should I also drop
> kallsyms_on_each_symbol for vmlinux BTF? I think we can use initcall for it too,
> right?
Yep. Of course. For consistency.
> > > +}
> > > +
> > > +static int btf_parse_kfunc_sets(struct btf *btf, struct module *mod,
> > > + struct bpf_verifier_log *log)
> > > +{
> > > + struct btf_parse_kfunc_data data = { .btf = btf, .log = log, };
> > > + struct btf_kfunc_set_tab *tab;
> > > + int hook, type, ret;
> > > +
> > > + if (!btf_is_kernel(btf))
> > > + return -EINVAL;
> > > + if (WARN_ON_ONCE(btf_is_module(btf) && !mod)) {
> > > + bpf_log(log, "btf internal error: no module for module BTF %s\n", btf->name);
> > > + return -EFAULT;
> > > + }
> > > + if (mod)
> > > + ret = module_kallsyms_on_each_symbol(mod, btf_parse_kfunc_sets_cb, &data);
> > > + else
> > > + ret = kallsyms_on_each_symbol(btf_parse_kfunc_sets_cb, &data);
> > > +
> > > + tab = btf->kfunc_set_tab;
> > > + if (!ret && tab) {
> > > + /* Sort all populated sets */
> > > + for (hook = 0; hook < ARRAY_SIZE(tab->sets); hook++) {
> > > + for (type = 0; type < ARRAY_SIZE(tab->sets[0]); type++) {
> > > + struct btf_id_set *set = tab->sets[hook][type];
> > > +
> > > + /* Not all sets may be populated */
> > > + if (!set)
> > > + continue;
> > > + sort(set->ids, set->cnt, sizeof(set->ids[0]), btf_kfunc_ids_cmp,
> > > + NULL);
> >
> > Didn't resolve_btfid store ids already sorted?
> > Why another sort is needed?
>
> Because it might be possible while iterating over symbols (be it vmlinux or
> module), we combine sets like [1, 4, 6] and [2, 3, 5] into [1, 4, 6, 2, 3, 5],
> into the set for a certain hook, type, so to enable bsearch we do one final sort
> after possible sets have been populated.
>
> > Because btf_populate_kfunc_sets() can concatenate the sets?
> > But if we let __init call it directly the module shouldn't use
> > the same hook/type combination multiple times with different sets.
> > So no secondary sorting will be necessary?
> >
>
> Yes, if we make it that only one call per hook/type can be done, then this
> shouldn't be needed, but e.g. if each file has a set for some hook and uses late
> initcall to do registration, then it will be needed again for the same reason.
You mean when module has multiple files and these sets have to be in different files?
> We can surely catch the second call (see if tab->[hook][type] != NULL).
Good idea. Let's do this for now.
One .c per module with such sets is fine, imo.
> > > This commit prepares the BTF parsing functions for vmlinux and module
> > > BTFs to find all kfunc BTF ID sets from the vmlinux and module symbols
> > > and concatentate all sets into single unified set which is sorted and
> > > keyed by the 'hook' it is meant for, and 'type' of set.
> >
> > 'sorted by hook' ?
> > The btf_id_set_contains() need to search it by 'id', so it's sorted by 'id'.
>
> Yeah, it needs a comma after 'sorted' :).
>
> > Is it because you're adding mod's IDs to vmlinux IDs and re-sorting them?
>
> No, I'm not mixing them. We only add same module's/vmlinux's IDs to its struct
> btf, then sort each set (for hook,type pair). find_kfunc_desc_btf gives us the
> BTF, then we can directly do what is essentially a single btf_id_set_contains
> call, so it is not required to research in vmlinux BTF. The BTF associated with
> the kfunc call is known.
Got it. Ok.
Powered by blists - more mailing lists