netdev - Re: [PATCH bpf-next] bpf: libbpf: retry program creation without the name

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <9358ea63-00ca-7acb-668a-d5c5dc5435d8@netronome.com>
Date:   Mon, 26 Nov 2018 19:45:26 +0000
From:   Quentin Monnet <quentin.monnet@...ronome.com>
To:     Vlad Dumitrescu <vlad@...itrescu.ro>
Cc:     sdf@...ichev.me, alexei.starovoitov@...il.com,
        Daniel Borkmann <daniel@...earbox.net>, sdf@...gle.com,
        netdev@...r.kernel.org, ast@...nel.org,
        Vlad Dumitrescu <vladum@...gle.com>
Subject: Re: [PATCH bpf-next] bpf: libbpf: retry program creation without the
 name

2018-11-26 11:08 UTC-0800 ~ Vlad Dumitrescu <vlad@...itrescu.ro>
> On Fri, Nov 23, 2018 at 2:51 AM Quentin Monnet
> <quentin.monnet@...ronome.com> wrote:
>>
>> 2018-11-21 09:28 UTC-0800 ~ Stanislav Fomichev <sdf@...ichev.me>
>>> On 11/21, Quentin Monnet wrote:
>>>> 2018-11-20 15:26 UTC-0800 ~ Stanislav Fomichev <sdf@...ichev.me>
>>>>> On 11/20, Alexei Starovoitov wrote:
>>>>>> On Wed, Nov 21, 2018 at 12:18:57AM +0100, Daniel Borkmann wrote:
>>>>>>> On 11/21/2018 12:04 AM, Alexei Starovoitov wrote:
>>>>>>>> On Tue, Nov 20, 2018 at 01:19:05PM -0800, Stanislav Fomichev wrote:
>>>>>>>>> On 11/20, Alexei Starovoitov wrote:
>>>>>>>>>> On Mon, Nov 19, 2018 at 04:46:25PM -0800, Stanislav Fomichev wrote:
>>>>>>>>>>> [Recent commit 23499442c319 ("bpf: libbpf: retry map creation without
>>>>>>>>>>> the name") fixed this issue for maps, let's do the same for programs.]
>>>>>>>>>>>
>>>>>>>>>>> Since commit 88cda1c9da02 ("bpf: libbpf: Provide basic API support
>>>>>>>>>>> to specify BPF obj name"), libbpf unconditionally sets bpf_attr->name
>>>>>>>>>>> for programs. Pre v4.14 kernels don't know about programs names and
>>>>>>>>>>> return an error about unexpected non-zero data. Retry sys_bpf without
>>>>>>>>>>> a program name to cover older kernels.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Stanislav Fomichev <sdf@...gle.com>
>>>>>>>>>>> ---
>>>>>>>>>>>   tools/lib/bpf/bpf.c | 10 ++++++++++
>>>>>>>>>>>   1 file changed, 10 insertions(+)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
>>>>>>>>>>> index 961e1b9fc592..cbe9d757c646 100644
>>>>>>>>>>> --- a/tools/lib/bpf/bpf.c
>>>>>>>>>>> +++ b/tools/lib/bpf/bpf.c
>>>>>>>>>>> @@ -212,6 +212,16 @@ int bpf_load_program_xattr(const struct bpf_load_program_attr *load_attr,
>>>>>>>>>>>       if (fd >= 0 || !log_buf || !log_buf_sz)
>>>>>>>>>>>               return fd;
>>>>>>>>>>>
>>>>>>>>>>> +     if (fd < 0 && errno == E2BIG && load_attr->name) {
>>>>>>>>>>> +             /* Retry the same syscall, but without the name.
>>>>>>>>>>> +              * Pre v4.14 kernels don't support prog names.
>>>>>>>>>>> +              */
>>>>>>>>>>
>>>>>>>>>> I'm afraid that will put unnecessary stress on the kernel.
>>>>>>>>>> This check needs to be tighter.
>>>>>>>>>> Like E2BIG and anything in the log_buf probably means that
>>>>>>>>>> E2BIG came from the verifier and nothing to do with prog_name.
>>>>>>>>>> Asking kernel to repeat is an unnecessary work.
>>>>>>>>>>
>>>>>>>>>> In general we need to think beyond this single prog_name field.
>>>>>>>>>> There are bunch of other fields in bpf_load_program_xattr() and older kernels
>>>>>>>>>> won't support them. Are we going to zero them out one by one
>>>>>>>>>> and retry? I don't think that would be practical.
>>>>>>>>> I general, we don't want to zero anything out. However,
>>>>>>>>> for this particular problem the rationale is the following:
>>>>>>>>> In commit 88cda1c9da02 we started unconditionally setting {prog,map}->name
>>>>>>>>> from the 'higher' libbpfc layer which breaks users on the older kernels.
>>>>>>>>>
>>>>>>>>>> Also libbpf silently ignoring prog_name is not great for debugging.
>>>>>>>>>> A warning is needed.
>>>>>>>>>> But it cannot be done out of lib/bpf/bpf.c, since it's a set of syscall
>>>>>>>>>> wrappers.
>>>>>>>>>> Imo such "old kernel -> lets retry" feature should probably be done
>>>>>>>>>> at lib/bpf/libbpf.c level. inside load_program().
>>>>>>>>> For maps bpftools calls bpf_create_map_xattr directly, that's why
>>>>>>>>> for maps I did the retry on the lower level (and why for programs I initially
>>>>>>>>> thought about doing the same). However, in this case maybe asking
>>>>>>>>> user to omit 'name' argument might be a better option.
>>>>>>>>>
>>>>>>>>> For program names, I agree, we might think about doing it on the higher
>>>>>>>>> level (although I'm not sure whether we want to have different API
>>>>>>>>> expectations, i.e. bpf_create_map_xattr ignoring the name and
>>>>>>>>> bpf_load_program_xattr not ignoring the name).
>>>>>>>>>
>>>>>>>>> So given that rationale above, what do you think is the best way to
>>>>>>>>> move forward?
>>>>>>>>> 1. Same patch, but tighten the retry check inside bpf_load_program_xattr ?
>>>>>>>>> 2. Move this retry logic into load_program and have different handling
>>>>>>>>>     for bpf_create_map_xattr vs bpf_load_program_xattr ?
>>>>>>>>> 3. Do 2 and move the retry check for maps from bpf_create_map_xattr
>>>>>>>>>     into bpf_object__create_maps ?
>>>>>>>>>
>>>>>>>>> (I'm slightly leaning towards #3)
>>>>>>>>
>>>>>>>> me too. I think it's cleaner for maps to do it in
>>>>>>>> bpf_object__create_maps().
>>>>>>>> Originally bpf.c was envisioned to be a thin layer on top of bpf syscall.
>>>>>>>> Whereas 'smart bits' would go into libbpf.c
>>>>>>>
>>>>>>> Can't we create in bpf_object__load() a small helper bpf_object__probe_caps()
>>>>>>> which would figure this out _once_ upon start with a few things to probe for
>>>>>>> availability in the underlying kernel for maps and programs? E.g. programs
>>>>>>> it could try to inject a tiny 'r0 = 0; exit' snippet where we figure out
>>>>>>> things like prog name support etc. Given underlying kernel doesn't change, we
>>>>>>> would only try this once and it doesn't require fallback every time.
>>>>>>
>>>>>> +1. great idea!
>>>>> Sounds good, let me try to do it.
>>>>>
>>>>> It sounds more like a recent LPC proposal/idea to have some sys_bpf option
>>>>> to query BPF features. This new bpf_object__probe_caps can probably query
>>>>> that in the future if we eventually add support for it.
>>>>>
>>>>
>>>> Hi,
>>>>
>>>> LPC proposal indeed. I've been working on implementing this kind of
>>>> probes in bpftool. I don't probe name support for now (but I can
>>>> certainly add it), but I detect supported program types, map types,
>>>> header functions, and a couple of other parameters. The idea (initially
>>>> from Daniel) was to dump "#define" declarations that could later be
>>>> included in a header file and used for a BPF project (or alternatively,
>>>> JSON output).
>>> Oh, nice, I didn't know someone was already working on it!
>>>
>>>> I felt like bpftool was maybe a better place to do it, as the set of
>>>> probes may grow large (all types, helpers, etc). It might have
>>>> consequences on the running system: for example, if I don't raise the
>>>> rlimit in bpftool before starting the probes, a good half of them fail
>>>> because of consecutive program creation and reclaim delay for locked
>>>> memory usage.
>>> Should we aim for something like build system feature checks? Where we
>>> preserve these probe results between bpftool/libbpf invocations so we
>>> don't re-run them again?
>>>
>>>> So should we start adding probes to libbpf and should I move mine to the
>>>> lib as well, or should the one in the v3 of this series be directed to
>>>> something like bpftool instead?
>>> We need them (well, at least the name checks) for libbpf because that's
>>> what we link against and what we use to load the programs, bpftool is
>>> less of an issue right now. But my patch was mostly a hackish solution
>>> until we get the real feature checks :-)
>>
>> Hi Stanislav,
>> Apologies for the delayed answer, I have been travelling.
>>
>> I don't know if the probes should be "shared" with libbpf. What I had in
>> mind was rather to run them with bpftool and to produce something like a
>> header containing the results of the probes. Then this header could be
>> included in whatever software wants to manage BPF objects, and that
>> software can then decide to call (or not to call) libbpf functions in
>> one way or another, depending on the features supported by the system.
>> It means you have to recompile your program for the target system, or at
>> least to load probe results as a configuration file somehow.
> 
> Will this work for what Stanislav is trying to solve? The 'whatever
> software wants to manage BPF objects' (WS) has no option to choose
> whether the map and program names are passed to kernel. The libbpf API
> used by WS, in this case, is bpf_object__load(), which internally
> decides when names are passed.

Hm that's correct. I had in mind something like
bpf_load_program_xattr(), but it's true that with bpf_object__load() you
do not get to choose if you pass the name or not. Well I suppose we'll
keep some probes in libbpf, bpftool might not be a solution in that case
indeed.

Thanks,
Quentin