Message-ID: <87eeva69lj.fsf@toke.dk>
Date: Tue, 04 Feb 2020 09:27:20 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Andrii Nakryiko <andrii.nakryiko@...il.com>
Cc: Daniel Borkmann <daniel@...earbox.net>,
Stephen Hemminger <stephen@...workplumber.org>,
Alexei Starovoitov <ast@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
David Miller <davem@...emloft.net>,
Jesper Dangaard Brouer <brouer@...hat.com>,
Networking <netdev@...r.kernel.org>, bpf <bpf@...r.kernel.org>
Subject: Re: [RFC bpf-next 0/5] Convert iproute2 to use libbpf (WIP)

Andrii Nakryiko <andrii.nakryiko@...il.com> writes:
> On Mon, Feb 3, 2020 at 11:34 AM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>>
>> Andrii Nakryiko <andrii.nakryiko@...il.com> writes:
>>
>> > On Wed, Aug 28, 2019 at 1:40 PM Andrii Nakryiko
>> > <andrii.nakryiko@...il.com> wrote:
>> >>
>> >> On Fri, Aug 23, 2019 at 4:29 AM Toke Høiland-Jørgensen <toke@...hat.com> wrote:
>> >> >
>> >> > [ ... snip ...]
>> >> >
>> >> > > E.g., today's API is essentially three steps:
>> >> > >
>> >> > > 1. open and parse ELF: collect relos, programs, map definitions
>> >> > > 2. load: create maps from collected defs, do program/global data/CO-RE
>> >> > > relocs, load and verify BPF programs
>> >> > > 3. attach programs one by one.
>> >> > >
>> >> > > Between step 1 and 2 user has flexibility to create more maps, set up
>> >> > > map-in-map, etc. Between 2 and 3 you can fill in global data, fill in
>> >> > > tail call maps, etc. That's already pretty flexible. But we can tune
>> >> > > and break apart those steps even further, if necessary.
>> >> >
>> >> > Today, steps 1 and 2 can be collapsed into a single call to
>> >> > bpf_prog_load_xattr(). As Jesper's mail explains, for XDP we don't
>> >> > generally want to do all the fancy rewriting stuff, we just want a
>> >> > simple way to load a program and get reusable pinning of maps.
>> >>
>> >> I agree. See my response to Jesper's message. Note also my view on
>> >> the existence of bpf_prog_load_xattr().
>> >>
>> >> > Preferably in a way that is compatible with the iproute2 loader.
>> >> >
>> >
>> > Hi Toke,
>> >
>> > I was wondering what's the state of converting iproute2 to use libbpf?
>> > Is this still something you (or someone else) are interested in doing?
>>
>> Yeah, it's still on my list; planning to circle back to it once I have
>> finished an RFC implementation for XDP multiprog loading based on the
>> new function-replacing in the kernel.
>>
>> (Not that this should keep anyone else from giving the conversion a go
>> and beating me to it :)).
>>
>> > Briefly re-reading the thread, I think libbpf already has almost
>> > everything to be used by iproute2. You've added map pinning, so with
>> > bpf_map__set_pin_path() iproute2 should be able to specify pinning
>> > path, according to its own logic. The only thing missing that I can
>> > see is ability to specify numa_node, which we should add both to
>> > BTF-defined map definitions (trivial change), as well as probably
>> > expose a method like bpf_map__set_numa_node(struct bpf_map *map, int
>> > numa_node) for non-declarative and non-BTF legacy cases.
>>
>> Yes, adding this to libbpf would be good.
>>
>> > There was concern about supporting "extended" bpf_map_def format of
>> > iproute2 (bpf_elf_map, actually) with extra fields. I think it's
>> > actually easy to handle as is without any extra new APIs.
>> > bpf_object__open() w/ .relaxed_maps = true option will process
>> > compatible 5 fields of bpf_map_def (type, key/value sizes,
>> > max_entries, and map_flags) and will set up corresponding struct
>> > bpf_map entries (but won't create BPF maps in kernel yet). Then
>> > iproute2 can iterate over "maps" ELF section on its own, and see which
>> > maps need to get some more adjustments before load phase: map-in-map
>> > set up, numa node, pinning, etc. All those adjustments can be done
>> > (except for numa, so far) through existing libbpf APIs, as far as I can
>> > tell. Once that is taken care of, proceed to bpf_object__load() and
>> > other standard steps. No callbacks, no extra cruft.
>> >
>> > Is there anything else that can block iproute2 conversion to libbpf?
>>
>> I haven't looked into the details since my last RFC conversion series,
>> but from what I recall from that, and what we've been changing in libbpf
>> since, I was basically planning to do what you explained. So while there
>> are some details to work out, I believe it's basically straightforward,
>> and I can't think of anything that should block it.
>>
>
> Great! Just to disambiguate and make sure we are in agreement, my hope
> here is that iproute2 can completely delegate to libbpf all the ELF
> parsing, map creation, program loading, etc (including all the new
> stuff like global variables, etc). And only for legacy maps in
> SEC("maps"), it would have to parse that *single* ELF section (again,
> on its own) and see if there are any extra features of struct
> bpf_elf_map requested (i.e., numa, map-in-map, pinning), and if yes,
> it would use programmatic libbpf APIs to set this up. It might need to
> do additional BPF_PROG_ARRAY set up after BPF programs are loaded
> (because iproute2 has its custom naming-based convention). But
> hopefully we'll encourage people to gradually migrate to BTF-defined
> maps with declarative ways of doing all that.

Yup, that is my hope as well. Let's see how it goes :)
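
For illustration, the scan of the legacy "maps" section described above
could look roughly like the sketch below. It is not iproute2 or libbpf
code: the struct layout is assumed to mirror iproute2's struct
bpf_elf_map and should be checked against bpf_elf.h, and the names
elf_map_sketch, elf_map_extras() and scan_maps_section() are
hypothetical. Treating a non-zero inner_id as a map-in-map request is
likewise a simplifying assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Assumed layout, modelled on iproute2's struct bpf_elf_map.
 * Hypothetical name; verify the fields against bpf_elf.h before use.
 */
struct elf_map_sketch {
	uint32_t type;
	uint32_t size_key;
	uint32_t size_value;
	uint32_t max_elem;
	uint32_t flags;
	uint32_t id;
	uint32_t pinning;
	uint32_t inner_id;
	uint32_t inner_idx;
};

/* Bitmask of extra features that need programmatic setup before load. */
enum {
	EXTRA_NONE       = 0,
	EXTRA_PINNING    = 1 << 0,
	EXTRA_MAP_IN_MAP = 1 << 1,
};

/* Report which extras a single legacy map definition asks for. */
static unsigned int elf_map_extras(const struct elf_map_sketch *m)
{
	unsigned int extras = EXTRA_NONE;

	if (m->pinning != 0)	/* assumption: 0 means "no pinning" */
		extras |= EXTRA_PINNING;
	if (m->inner_id != 0)	/* assumption: non-zero means map-in-map */
		extras |= EXTRA_MAP_IN_MAP;
	return extras;
}

/*
 * Walk the raw bytes of the "maps" section as an array of map
 * definitions and record the extras each one requests in out[].
 * Returns the number of maps found, or -1 if the section size is not a
 * multiple of the struct size or out[] is too small.
 */
static int scan_maps_section(const void *data, size_t size,
			     unsigned int *out, size_t out_len)
{
	struct elf_map_sketch m;
	size_t i, n;

	if (size % sizeof(m) != 0)
		return -1;
	n = size / sizeof(m);
	if (n > out_len)
		return -1;

	for (i = 0; i < n; i++) {
		/* memcpy avoids alignment assumptions about the buffer */
		memcpy(&m, (const unsigned char *)data + i * sizeof(m),
		       sizeof(m));
		out[i] = elf_map_extras(&m);
	}
	return (int)n;
}
```

A loader would run something like scan_maps_section() on the section
contents after the open step, then use bpf_map__set_pin_path() and
friends on the flagged maps before calling bpf_object__load().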
-Toke