[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4Bza1T6h+MWadVjuCrPCY7pkyK9kw-fPdaRx2v3yzSsmcbg@mail.gmail.com>
Date: Tue, 19 Nov 2019 07:42:16 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Alexei Starovoitov <alexei.starovoitov@...il.com>
Cc: Andrii Nakryiko <andriin@...com>, bpf <bpf@...r.kernel.org>,
Networking <netdev@...r.kernel.org>,
Alexei Starovoitov <ast@...com>,
Daniel Borkmann <daniel@...earbox.net>,
Kernel Team <kernel-team@...com>
Subject: Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
On Mon, Nov 18, 2019 at 10:57 PM Andrii Nakryiko
<andrii.nakryiko@...il.com> wrote:
>
> On Mon, Nov 18, 2019 at 7:21 PM Alexei Starovoitov
> <alexei.starovoitov@...il.com> wrote:
> >
> > On Sat, Nov 16, 2019 at 11:08:06PM -0800, Andrii Nakryiko wrote:
> > > Add support for extern variables, provided to BPF program by libbpf. Currently
> > > the following extern variables are supported:
> > > - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
> > > executing, follows KERNEL_VERSION() macro convention;
> > > - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
> > > boolean, and integer values are supported. Strings are not supported at
> > > the moment.
> > >
> > > All values are represented as 64-bit integers, with the follow value encoding:
> > > - for boolean values, y is 1, n or missing value is 0;
> > > - for tristate values, y is 1, m is 2, n or missing value is 0;
> > > - for integers, the values is 64-bit integer, sign-extended, if negative; if
> > > config value is missing, it's represented as 0, which makes explicit 0 and
> > > missing config value indistinguishable. If this will turn out to be
> > > a problem in practice, we'll need to deal with it somehow.
> >
> > I read that statement as there is no extensibility for such api.
>
> What do you mean exactly?
>
> Are you worried about 0 vs undefined case? I don't think it's going to
> be a problem in practice. Looking at my .config, I see that integer
> config values set to their default values are still explicitly
> specified with those values. E.g.,
>
> CONFIG_HZ_1000=y
> CONFIG_HZ=1000
>
> CONFIG_HZ default is 1000, if CONFIG_HZ_1000==y, but still I see it
> set. So while I won't claim that it's the case for any possible
> integer config, it seems to be pretty consistent in practice.
>
> Also, I see a lot of values set to explicit 0, like:
>
> CONFIG_BASE_SMALL=0
>
> So it seems like integers are typically spelled out explicitly in real
> configs and I think this 0 default is pretty sane.
>
> Next, speaking about extensibility. Once we have BTF type info for
> externs, our possibilities are much better. It will be possible to
> support bool, int, in64 for the same bool value. Libbpf will be able
> to validate the range and fail program load if declared extern type
> doesn't match actual value type and value range. So I think
> extensibility is there, but right now we are enforcing (logically)
> everything to be uin64_t. Unfortunately, with the way externs are done
> in ELF, I don't know neither type nor size, so can't be more strict
> than that.
>
> If we really need to know whether some config value is defined or not,
> regardless of its value, we can have it by convention. E.g.,
> CONFIG_DEFINED_XXX will be either 0 or 1, depending if corresponding
> CONFIG_XXX is defined explicitly or not. But I don't want to add that
> until we really have a use case where it matters.
>
> >
> > > Generally speaking, libbpf is not aware of which CONFIG_XXX values is of which
> > > expected type (bool, tristate, int), so it doesn't enforce any specific set of
> > > values and just parses n/y/m as 0/1/2, respectively. CONFIG_XXX values not
> > > found in config file are set to 0.
> >
> > This is not pretty either.
>
> What exactly: defaulting to zero or not knowing config value's type?
> Given all the options, defaulting to zero seems like the best way to
> go.
>
> >
> > > +
> > > + switch (*value) {
> > > + case 'n':
> > > + *ext_val = 0;
> > > + break;
> > > + case 'y':
> > > + *ext_val = 1;
> > > + break;
> > > + case 'm':
> > > + *ext_val = 2;
> > > + break;
>
> reading some more code from scripts/kconfig/symbol.c, I'll need to
> handle N/Y/M and 0x hexadecimals, will add in v2 after collecting some
> more feedback on this version.
>
> > > + case '"':
> > > + pr_warn("extern '%s': strings are not supported\n",
> > > + ext->name);
> > > + err = -EINVAL;
> > > + goto out;
> > > + default:
> > > + errno = 0;
> > > + *ext_val = strtoull(value, &value_end, 10);
> > > + if (errno) {
> > > + err = -errno;
> > > + pr_warn("extern '%s': failed to parse value: %d\n",
> > > + ext->name, err);
> > > + goto out;
> > > + }
> >
> > BPF has bpf_strtol() helper. I think it would be cleaner to pass whatever
> > .config has as bytes to the program and let program parse n/y/m, strings and
> > integers.
>
> Config value is not changing. This is an incredible waste of CPU
> resources to re-parse same value over and over again. And it's
> incredibly much worse usability as well. Again, once we have BTF for
> externs, we can just declare values as const char[] and then user will
> be able to do its own parsing. Until then, I think pre-parsing values
> into convenient u64 types are much better and handles all the typical
> cases.
One more thing I didn't realize I didn't state explicitly, because
I've been thinking and talking about that for so long now, that it
kind of internalized completely.
These externs, including CONFIG_XXX ones, are meant to interoperate
nicely with field relocations within BPF CO-RE concept. They are,
among other things, are meant to disable parts of BPF program logic
through verifier's dead code elimination by doing something like:
if (CONFIG_SOME_FEATURES_ENABLED) {
BPF_CORE_READ(t, some_extra_field);
/* or */
bpf_helper_that_only_present_when_feature_is_enabled();
} else {
/* fallback logic */
}
With CONFIG_SOME_FEATURES_ENABLED not being a read-only integer
constant when BPF program is loaded, this is impossible. So it
absolutely must be some sort of easy to use integer constant. With BTF
for externs we can have more flexibility in the future, of course.
>
> >
> > LINUX_KERNEL_VERSION is a special case and can stay as u64.
> >
Powered by blists - more mailing lists