[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20181121172831.ujaeekkzlfktc3mg@mini-arch.hsd1.ca.comcast.net>
Date: Wed, 21 Nov 2018 09:28:31 -0800
From: Stanislav Fomichev <sdf@...ichev.me>
To: Quentin Monnet <quentin.monnet@...ronome.com>
Cc: Alexei Starovoitov <alexei.starovoitov@...il.com>,
Daniel Borkmann <daniel@...earbox.net>,
Stanislav Fomichev <sdf@...gle.com>, netdev@...r.kernel.org,
ast@...nel.org, vladum@...gle.com
Subject: Re: [PATCH bpf-next] bpf: libbpf: retry program creation without the
name
On 11/21, Quentin Monnet wrote:
> 2018-11-20 15:26 UTC-0800 ~ Stanislav Fomichev <sdf@...ichev.me>
> > On 11/20, Alexei Starovoitov wrote:
> >> On Wed, Nov 21, 2018 at 12:18:57AM +0100, Daniel Borkmann wrote:
> >>> On 11/21/2018 12:04 AM, Alexei Starovoitov wrote:
> >>>> On Tue, Nov 20, 2018 at 01:19:05PM -0800, Stanislav Fomichev wrote:
> >>>>> On 11/20, Alexei Starovoitov wrote:
> >>>>>> On Mon, Nov 19, 2018 at 04:46:25PM -0800, Stanislav Fomichev wrote:
> >>>>>>> [Recent commit 23499442c319 ("bpf: libbpf: retry map creation without
> >>>>>>> the name") fixed this issue for maps, let's do the same for programs.]
> >>>>>>>
> >>>>>>> Since commit 88cda1c9da02 ("bpf: libbpf: Provide basic API support
> >>>>>>> to specify BPF obj name"), libbpf unconditionally sets bpf_attr->name
> >>>>>>> for programs. Pre v4.14 kernels don't know about programs names and
> >>>>>>> return an error about unexpected non-zero data. Retry sys_bpf without
> >>>>>>> a program name to cover older kernels.
> >>>>>>>
> >>>>>>> Signed-off-by: Stanislav Fomichev <sdf@...gle.com>
> >>>>>>> ---
> >>>>>>> tools/lib/bpf/bpf.c | 10 ++++++++++
> >>>>>>> 1 file changed, 10 insertions(+)
> >>>>>>>
> >>>>>>> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> >>>>>>> index 961e1b9fc592..cbe9d757c646 100644
> >>>>>>> --- a/tools/lib/bpf/bpf.c
> >>>>>>> +++ b/tools/lib/bpf/bpf.c
> >>>>>>> @@ -212,6 +212,16 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
> >>>>>>> if (fd >= 0 || !log_buf || !log_buf_sz)
> >>>>>>> return fd;
> >>>>>>>
> >>>>>>> + if (fd < 0 && errno == E2BIG && load_attr->name) {
> >>>>>>> + /* Retry the same syscall, but without the name.
> >>>>>>> + * Pre v4.14 kernels don't support prog names.
> >>>>>>> + */
> >>>>>>
> >>>>>> I'm afraid that will put unnecessary stress on the kernel.
> >>>>>> This check needs to be tighter.
> >>>>>> Like E2BIG and anything in the log_buf probably means that
> >>>>>> E2BIG came from the verifier and nothing to do with prog_name.
> >>>>>> Asking kernel to repeat is an unnecessary work.
> >>>>>>
> >>>>>> In general we need to think beyond this single prog_name field.
> >>>>>> There are bunch of other fields in bpf_load_program_xattr() and older kernels
> >>>>>> won't support them. Are we going to zero them out one by one
> >>>>>> and retry? I don't think that would be practical.
> >>>>> I general, we don't want to zero anything out. However,
> >>>>> for this particular problem the rationale is the following:
> >>>>> In commit 88cda1c9da02 we started unconditionally setting {prog,map}->name
> >>>>> from the 'higher' libbpfc layer which breaks users on the older kernels.
> >>>>>
> >>>>>> Also libbpf silently ignoring prog_name is not great for debugging.
> >>>>>> A warning is needed.
> >>>>>> But it cannot be done out of lib/bpf/bpf.c, since it's a set of syscall
> >>>>>> wrappers.
> >>>>>> Imo such "old kernel -> lets retry" feature should probably be done
> >>>>>> at lib/bpf/libbpf.c level. inside load_program().
> >>>>> For maps bpftools calls bpf_create_map_xattr directly, that's why
> >>>>> for maps I did the retry on the lower level (and why for programs I initially
> >>>>> thought about doing the same). However, in this case maybe asking
> >>>>> user to omit 'name' argument might be a better option.
> >>>>>
> >>>>> For program names, I agree, we might think about doing it on the higher
> >>>>> level (although I'm not sure whether we want to have different API
> >>>>> expectations, i.e. bpf_create_map_xattr ignoring the name and
> >>>>> bpf_load_program_xattr not ignoring the name).
> >>>>>
> >>>>> So given that rationale above, what do you think is the best way to
> >>>>> move forward?
> >>>>> 1. Same patch, but tighten the retry check inside bpf_load_program_xattr ?
> >>>>> 2. Move this retry logic into load_program and have different handling
> >>>>> for bpf_create_map_xattr vs bpf_load_program_xattr ?
> >>>>> 3. Do 2 and move the retry check for maps from bpf_create_map_xattr
> >>>>> into bpf_object__create_maps ?
> >>>>>
> >>>>> (I'm slightly leaning towards #3)
> >>>>
> >>>> me too. I think it's cleaner for maps to do it in
> >>>> bpf_object__create_maps().
> >>>> Originally bpf.c was envisioned to be a thin layer on top of bpf syscall.
> >>>> Whereas 'smart bits' would go into libbpf.c
> >>>
> >>> Can't we create in bpf_object__load() a small helper bpf_object__probe_caps()
> >>> which would figure this out _once_ upon start with a few things to probe for
> >>> availability in the underlying kernel for maps and programs? E.g. programs
> >>> it could try to inject a tiny 'r0 = 0; exit' snippet where we figure out
> >>> things like prog name support etc. Given underlying kernel doesn't change, we
> >>> would only try this once and it doesn't require fallback every time.
> >>
> >> +1. great idea!
> > Sounds good, let me try to do it.
> >
> > It sounds more like a recent LPC proposal/idea to have some sys_bpf option
> > to query BPF features. This new bpf_object__probe_caps can probably query
> > that in the future if we eventually add support for it.
> >
>
> Hi,
>
> LPC proposal indeed. I've been working on implementing this kind of
> probes in bpftool. I don't probe name support for now (but I can
> certainly add it), but I detect supported program types, map types,
> header functions, and a couple of other parameters. The idea (initially
> from Daniel) was to dump "#define" declarations that could later be
> included in a header file and used for a BPF project (or alternatively,
> JSON output).
Oh, nice, I didn't know someone was already working on it!
> I felt like bpftool was maybe a better place to do it, as the set of
> probes may grow large (all types, helpers, etc). It might have
> consequences on the running system: for example, if I don't raise the
> rlimit in bpftool before starting the probes, a good half of them fail
> because of consecutive program creation and reclaim delay for locked
> memory usage.
Should we aim for something like build system feature checks? Where we
preserve these probe results between bpftool/libbpf invocations so we
don't re-run them again?
> So should we start adding probes to libbpf and should I move mine to the
> lib as well, or should the one in the v3 of this series be directed to
> something like bpftool instead?
We need them (well, at least the name checks) for libbpf because that's
what we link against and what we use to load the programs, bpftool is
less of an issue right now. But my patch was mostly a hackish solution
until we get the real feature checks :-)
Powered by blists - more mailing lists